Site Reliability Engineer - Incident Management (6-8 yrs)
Gateway Search
posted 20d ago
About the job :
Hiring for an MNC client that provides software-as-a-service products for customer support, sales, and other customer communications.
The company was founded in Denmark in 2007.
It has over 100,000 customers and 5,000+ employees globally.
Currently hiring for a new Product Development Center of Excellence in Pune.
As an early hire, you will have a unique opportunity to be a pivotal part of this journey.
Role : Site Reliability Engineer
Location : Pune (Hybrid)
Description :
- Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.
- Within APAC SRE, our focus is on engineering quality solutions that solve our reliability and observability needs.
- Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
What you'll be doing
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Provide and institute proven practices around reliability, remediations, and incident management
- Build vital and efficient tooling to lower the barrier to entry for engineering teams to plug in and enjoy the benefits of Reliability
- Provide guidance to engineering teams on reliability standards to help proactively identify and resolve issues before they become incidents
- Rapidly understand issues with our products and help teams to optimally solve customer-impacting incidents
- Architect highly reliable production-grade platforms that enhance the reliability ideology
- Analyze the shortcomings of existing systems and propose alternatives
Required experience :
- 6+ years as a Site Reliability Engineer / Software Engineer with a focus on prevention, observability, and improvement in a project
- The ability to solve architectural problems is a must
- 6+ years of experience in coding with any language (Python / Java / Go / Scala, etc.)
- 5+ years of experience in system design is a must
- Experience architecting, improving, and operating reliable distributed platforms
- Experience in setting up frameworks that make services more reliable through automated process testing
- Knowledge of AWS, infrastructure, cloud-native software design, backend systems, and Kubernetes
- Experience building for and supporting a polyglot programming and datastore environment
- Enjoyment from helping others learn and improve through pairing, code review, tech talks, etc.
Functional Areas: Software/Testing/Networking