App Site Reliability Engineer
Infosys
posted 8hr ago
Flexible timing
Key skills for the job
A day in the life of an Infoscion

As a Senior Site Reliability Engineer, you will play a critical role in supporting application developers by providing expert guidance on application and infrastructure best practices from a reliability perspective. You will:

- Improve the reliability, quality, and time-to-market of our suite of products/applications
- Define suitable system metrics with SLOs/SLIs and set up observability mechanisms to track them
- Define error budgets as per the SLOs
- Define the strategy for, and set up, High Availability and Load Balancer based architecture
- Drive a metrics-driven culture and software delivery process, using data to measure overall system quality and reliability
- Balance feature development speed and reliability with well-defined service level objectives
- Provide primary operational support and engineering for products/applications
- Partner with solution architects and development teams to improve service reliability
- Participate in system design
- Participate in optimizing code, automating operational tasks, and reducing toil
- Provide solutions for performance management, monitoring, and observability
- Work with business users to understand issues, develop root cause analyses, and work with the development team on enhancements/fixes
- Work with distributed traces to visualize the entire workflow and analyze the cause of problems/incidents
- Improve the security and performance of applications
- Define, evangelize, and maintain SRE best practices
- Devise and implement DevSecOps best practices
- Improve automation, including the system's self-healing capability
- Manage and participate in on-call incidents, if required (Priority Incident)

If you think you fit right in to help our clients navigate their next in their digital transformation journey, this is the place for you!
- Must have at least 5 years of SRE experience in large programs, with a focus on release engineering, observability tasks, and reliability
- Reliability practices
- Chaos engineering
- Strong experience with one or more observability tools such as New Relic, AppDynamics, Prometheus, Dynatrace, DataDog, Splunk
- Experience in event correlation using observability or other tools such as BigPanda
- Experience in observability dashboard creation, custom metrics, Synthetic Monitoring, and Real User Monitoring (RUM)
- Good experience in scripting or development languages, including expertise in any one of Python, Ruby, JSON, Java, Node.js, PHP
- Experience with scripting in PowerShell (M) and any one of Bash/Shell/Perl
- Strong knowledge of application design and architecture, including microservices architecture
- Experience in CI/CD tooling and best practices
- Experience with cloud platforms such as AWS, Azure, and Google
- AIOps and related tools
- Experience in container orchestration and practices, including Kubernetes, Docker Swarm
- Experience in infrastructure automation tools such as Terraform, CloudFormation, Ansible, and Puppet (any one)
- Knowledge of SQL, NoSQL (Oracle, Couchbase)
- Experience working with ITSM tools such as Remedy, ServiceNow, Confluence, Jira
- Experience with cloud cost optimization / FinOps
Employment Type: Full Time, Permanent