Position Overview
We are seeking a skilled and motivated Site Reliability Engineer with hands-on expertise in application operations and DevOps tooling. The ideal candidate has experience supporting production systems, a solid understanding of observability, and a foundational grasp of SRE principles. The role also requires basic to intermediate programming skills and familiarity with modern development practices.
Key Responsibilities
Provide support for production applications, ensuring the stability and availability of systems.
Diagnose, troubleshoot, and resolve production issues in real time.
Design, implement, and maintain CI/CD pipelines using Jenkins, Git/Bitbucket, and other DevOps tools.
Manage and deploy applications using containerization and orchestration tools such as Docker and Kubernetes.
Set up and maintain observability tools (Grafana, Prometheus, Instana) for monitoring, logging, and alerting.
Write and maintain infrastructure as code using Terraform.
Collaborate with development teams to implement SRE practices and principles, ensuring reliability, scalability, and performance.
Assist in incident management and post-mortem analysis to improve system reliability.
Contribute to the automation of repetitive tasks and system processes.
Must-Have Skills
Production Support Expertise: Experience in application operations, troubleshooting, and system monitoring.
DevOps Tools and Platforms:
  o CI/CD: Jenkins, Git/Bitbucket.
  o Containerization & Orchestration: Docker, Kubernetes.
  o Observability: Grafana, Prometheus, Instana.
  o Infrastructure as Code: Terraform.
Micro-frontend and microservices architecture.
Programming Skills: Basic to intermediate proficiency in one or more languages, frameworks, or runtimes such as Angular, TypeScript, Python, or Node.js.
SRE Principles: Understanding of Site Reliability Engineering practices to support reliability and performance goals.
Good-to-Have Skills
Experience with AWS cloud services.
Knowledge of NGINX web server and reverse proxy configurations.
Familiarity with Kafka for event streaming.
Hands-on experience with OpenSearch and Kibana for advanced analytics and visualization.
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
3+ years of experience in a similar role involving production support and DevOps.
Strong problem-solving and analytical skills, with the ability to work under pressure.
Excellent communication and teamwork abilities.