Lead Site Reliability Engineer
Eli Lilly and Company
posted 3d ago
Flexible timing
The Software Product Engineering organization (SPE) delivers innovative tech solutions to aid, accelerate, and support work done across Lilly. This role is intended for a DevOps engineer who enjoys working with a cross-functional team, developing robust infrastructure and code that accelerate business processes, and thinking innovatively. This position reports to the Team Lead - Software Configuration & Development and partners with individuals across TechLilly organizations to deliver solutions and enable technology supporting a wide range of software and business processes.
Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
8 to 13 years of relevant experience.
Must-Have Skills and Experience
Cloud Management: Utilize extensive AWS knowledge to manage and optimize cloud-based solutions, ensuring scalability, security, and cost-effectiveness.
Container Orchestration: Deploy and manage Kubernetes clusters to streamline the deployment and scaling of applications. Leverage Kubernetes capabilities to enhance the reliability and agility of our services.
DevOps Practices: Implement and advocate for best DevOps practices, including continuous integration (CI), continuous delivery (CD), and infrastructure as code, to improve the efficiency and reliability of the development lifecycle. Manage code repositories using GitHub and automate workflows with GitHub Actions. Ensure seamless integration and deployment processes through automated pipelines.
Monitoring and Observability: Monitor application performance, detect anomalies, and ensure system health using tools like Splunk, AppDynamics, Datadog, New Relic, and open-source tools like Prometheus, Grafana, and Jaeger.
Scripting and Automation: Develop and maintain scripts using Python or any other scripting languages to automate routine tasks, enhance monitoring, and improve system performance and reliability.
Experience in L1 & L2 Support: Provide expert-level support for applications, ensuring timely and efficient resolution of first- and second-level support issues. Maintain and enhance the stability and performance of applications across various environments. Apply ITIL knowledge to streamline processes and improve service management.
Incident and Problem Management: Troubleshoot incidents and problems to ensure seamless operations. Quickly resolve technical issues and identify root causes to prevent future incidents.
Documentation Skills: Keep thorough records of issues, fixes, and maintenance tasks for future reference. Create and maintain detailed runbooks.
Experience with disaster recovery (DR) and business continuity planning (BCP).
Expertise in creating and maintaining ServiceNow (SNOW) dashboards and reporting.
Proven track record of applying Site Reliability Engineering (SRE) principles.
Knowledge of Tomcat or any other application server.
Knowledge of Linux and shell scripting.
Effective prioritization skills considering urgency and business impact.
Proactive mindset in addressing and resolving issues.
Exhibit a strong sense of ownership and accountability for all tasks and responsibilities.
Work as an engineer specializing in Kubernetes and Amazon Web Services on a team of full stack software developers to develop and maintain software platforms and DevOps processes
Guide and collaborate with internal application teams to deploy solutions on a custom Cloud Deployment Platform
Improve and maintain reusable pipeline templates and patterns for automated deployment of cloud infrastructure and code
Develop and support high-quality automation workflows inside and outside the cloud platform that are appropriate for business and technology strategies
Monitor and troubleshoot the software delivery process
Work with software developers and operations engineers to improve the software delivery process
Stay up to date on the latest DevOps and ITIL practices and technologies
Strive to provide internal customers with excellent customer service
Effectively contribute to the communication of platform health, risks, and issues to the program partners, stakeholders, and management teams
Resolve most conflicts between prioritization and scope independently, and use good judgment to escalate complex or consequential issues to senior management
Be a self-starter, able to come up with solutions to problems and complete those solutions while coordinating with other teams
Work in a modern Agile/Kanban environment to deliver customer value with regular cadence
Experience working with teams across organizational and geographic boundaries and multiple levels within the organization
Excellent proactive oral and written communication skills
Experience in multiple common programming languages
Good-to-Have Skills and Certifications
ITIL Foundation certification.
AWS or any other relevant Cloud Certification(s).
Knowledge of SQL.
Knowledge of Postman.
Familiarity with MuleSoft and API integrations.
Understanding of networking concepts.
Experience in the Pharmaceutical domain.
Employment Type: Full Time, Permanent