As a Site Reliability Engineer (SRE) for our large and regionally distributed SaaS platform, your primary responsibilities will be to improve the reliability and availability of our mission-critical cloud-based services.
How will you make an impact?
Essential Duties and Responsibilities
Observability and Monitoring
Create new dashboards and metrics to provide comprehensive observability into the health and performance of development teams' applications, including SLI/SLO metrics.
Work with development teams to ensure proper monitoring is set up and enabled for their services.
Identify evolutionary improvements to the observability and monitoring solutions.
Reliability Consulting and Automation
Consult with development teams on SRE services and best practices to help them improve the reliability of their applications.
Create automation and tooling to reduce toil and manual intervention.
Incident and Problem Management
Assist other teams in data and performance analysis to identify the root causes of issues and recommend automation actions.
Knowledge Sharing and Mentoring
Review the work of other SREs and provide training and guidance to help them improve their skills.
Communicate effectively with both technical and non-technical peers and customers.
Process and Documentation
Follow established processes when performing work, and help create and document processes as necessary.
Document troubleshooting steps and results in appropriate locations for historical access.
Ensure compliance with policies, procedures, and standards.
Implement or coordinate remediation required by audits and assessments, documenting it as necessary.
Time Estimation
Estimate the time required to complete activities and projects.
Have you got what it takes?
4+ years of programming/scripting experience with any of the following: Go, Python, .NET (C#), Node.js
4+ years of experience working within public or private cloud environments
4+ years of SRE/DevOps/Observability or related experience
4+ years of AWS experience
Experience with Agile, Jira, GitHub, monitoring, automation, dashboarding
Join an ever-growing, market-disrupting, global company where the teams - comprised of the best of the best - work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!
Enjoy NICE-FLEX!
At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.
About NICE
NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime, and ensure public safety. Every day, NICE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.
Known as an innovation powerhouse that excels in AI, cloud and digital, NICE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.
Requisition ID: 5258
Reporting into: Manager, Cloud Operations
Role Type: Individual Contributor