ABOUT THE COMPANY:
Nymbus (https://nymbus.com/) isn't just a leader in fintech; we're a community of innovators passionate about reimagining banking.
Our award-winning modern core platform and cloud-based technology serve as the backbone for financial institutions eager to modernize and excel.
Here, you won't just be part of a tech revolution; you'll be at the helm, driving change.
You'll fit right in if you're a creative thinker eager to lessen technical debt and elevate agility for banks and credit unions.
Our culture thrives on collaboration, integrity, and a client-first approach.
Your journey with us won't simply advance your career; it will offer the chance to shape an industry alongside like-minded professionals.
We're excited to consider you a key player in this transformative chapter.
Thank you for contemplating a role with Nymbus.
JOB SUMMARY:
We seek an experienced Site Reliability Engineer to join our Network Operations Center team, with a specialized focus on security patching and software monitoring, observability, and alerting using Datadog.
In this multifaceted role, you will ensure the high availability, performance, and security of our network and software applications.
You will leverage your deep understanding of network operations, cybersecurity, and Datadog's comprehensive monitoring tools to identify and mitigate issues preemptively, ensuring a seamless user experience.
This position requires flexibility in working hours, including occasional off-hours and weekends, to support our operations and meet project deadlines.
We value work-life balance and offer flexible scheduling to accommodate these demands.
RESPONSIBILITIES:
Implement and manage comprehensive monitoring, observability, and alerting strategies using Datadog for real-time insights into the performance and health of our software applications and network infrastructure.
Proactively monitor system performance, identify potential issues, and execute troubleshooting and resolution to minimize downtime and service disruptions.
Develop and maintain a robust security patching program, ensuring all network devices, servers, and applications are regularly updated to protect against vulnerabilities and cyber threats.
Collaborate with development and operations teams to enhance system reliability by adopting SRE best practices and Datadog's monitoring capabilities.
Customize Datadog dashboards and alerts to meet the specific needs of our operations, ensuring critical issues are promptly identified and addressed.
Automate routine patching, monitoring, and maintenance tasks to improve operational efficiency and accuracy.
Participate in incident response and post-mortem analysis, utilizing Datadog data to identify root causes and implement preventive measures.
Keep abreast of the latest trends and technologies in SRE and monitoring tools, particularly Datadog's evolving features and capabilities.
QUALIFICATIONS:
Bachelor's degree in Computer Science, Information Technology, Cybersecurity, or a related field.
Linux administration and system patch management experience is required.
3+ years of experience in a site reliability engineering, system administration, or similar role, with specific experience in network operations and cybersecurity.
Proven expertise in using Datadog for monitoring, observability, and alerting in a complex network and software environment.
Strong understanding of network protocols, infrastructure, and security patching strategies.
Proficiency in scripting and automation tools (e.g., Python, Bash, Ansible).
Experience with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
Proficiency in Java.
Excellent problem-solving skills, with the ability to work under pressure and manage multiple priorities.
Strong communication and collaboration skills, capable of working effectively with cross-functional teams.
PREFERRED SKILLS:
Certifications in Datadog, cloud computing, or cybersecurity (e.g., Datadog Pro, AWS Certified Solutions Architect, CISSP).
Experience with DevOps and CI/CD pipelines.
Knowledge of additional monitoring and observability tools.
SALARY & BENEFITS:
$100,000 – $145,000 Annual Salary
Annual Cash Bonus and Equity Options commensurate with the role level and experience
100% Fully Remote
Robust 401(k) plan with company match
Insurance - Health, Dental and Vision (Nymbus covers 100% of the Healthcare and Basic Dental premiums)
Flexible Paid Time Off
Ready to join? We invite you to watch this video and learn who we are and how we build and innovate together!
Let's Go!