Sandhata Technologies
4 Sandhata Technologies Developer Jobs
API SRE Developer
posted 3d ago
Flexible timing
on AWS using EKS (Elastic Kubernetes Service).
- Develop, maintain, and enhance monitoring and alerting systems using Datadog to proactively identify and address potential issues, ensuring optimal system performance.
- Participate in the design and implementation of CI/CD pipelines using Azure DevOps, enabling automated and reliable software delivery.
- Lead incident response and troubleshooting to quickly diagnose and resolve production incidents, minimizing downtime and impact on users.
- Take ownership of reliability initiatives by identifying areas for improvement, conducting root cause analysis, and implementing solutions to prevent incidents from recurring.
- Collaborate with cross-functional teams to ensure security, compliance, and performance standards are met throughout the development lifecycle.
- Participate in on-call rotations and provide 24/7 support for critical incidents, ensuring rapid response and resolution.
- Work with the development teams to define and establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and maintain the system's reliability.
- Contribute to the documentation of processes, procedures, and best practices to enhance knowledge sharing within the team.

Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- Minimum of 4 years of experience in a Site Reliability Engineer or similar role, managing cloud-based infrastructure on AWS with EKS.
- Strong expertise in AWS services, especially EKS, including cluster provisioning, scaling, and management.
- Proficiency in monitoring and observability tools, with hands-on experience in Datadog or similar tools for tracking system performance and generating meaningful alerts.
- Experience implementing CI/CD pipelines using Azure DevOps or similar tools to automate software deployment and testing.
- Solid understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes) and their role in modern application architectures.
- Excellent troubleshooting skills and the ability to analyze complex issues, determine root causes, and implement effective solutions.
- Strong scripting and automation skills (Python, Bash, etc.).
- Familiarity with infrastructure as code (IaC) tools such as Terraform or CloudFormation.
- Experience with incident management, post-incident analysis, and implementing improvements based on lessons learned.
- Good understanding of security best practices and compliance standards in cloud environments.
- Exceptional communication skills and the ability to collaborate effectively with cross-functional teams.
- Willingness to participate in on-call rotations and provide off-hours support when necessary.

Preferred:
- Relevant certifications such as AWS Certified DevOps Engineer, AWS Certified SRE, or Kubernetes certifications.
- Experience with other cloud platforms (e.g., Azure, Google Cloud Platform).
- Familiarity with microservices architecture and service mesh technologies.
- Prior experience with application performance tuning and optimization.
Employment Type: Full Time, Permanent