Arting Digital
Site Reliability Engineer - Kubernetes (3-8 yrs)
posted 10d ago
Flexible timing
Job title : SRE (Kubernetes + microservices)
Experience : 3+ years
Location : Bangalore
Mode : Hybrid
Skills : Kubernetes, container orchestration (mainly cluster autoscaling), microservices architecture, Helm charts.
Key Responsibilities :
1. Kubernetes Administration : Deploy, manage, and monitor Kubernetes clusters, ensuring high availability and performance.
2. Cluster Upgrades : Plan and execute upgrades for Kubernetes clusters and associated tools, minimizing downtime and ensuring smooth transitions.
3. Microservices Reliability : Work closely with development teams to ensure the reliability of microservices in terms of autoscaling, rollout strategies, and integrations with secrets and external components. Support the teams with a strong grasp of compute-resource concepts such as CPU requests and limits, CPU throttling, etc.
4. Cloud-Native Tools : Implement and manage tools such as Istio Service Mesh, Argo CD, Prometheus, External-Secrets Operator, KEDA, and others to enhance the functionality and reliability of our cloud platform.
5. Helm & Kustomize : Utilize Helm charts and Kustomize for Kubernetes resource management.
6. GitOps with Argo CD : Implement and manage GitOps workflows using Argo CD.
7. Troubleshooting : Diagnose and resolve complex issues promptly.
8. On-Call Rotation : Participate in an on-call rotation to provide 24/7 support for critical systems.
Required Qualifications :
1. Proven experience with Kubernetes and container orchestration, mainly cluster autoscaling, HPA, VPA, liveness and readiness probes, PDB, affinities, etc.
2. Strong understanding of microservices architecture.
3. Experience with cloud-native tools such as Istio, Argo CD, Prometheus, External-Secrets Operator, KEDA, Karpenter, etc.
4. Proficiency with Helm charts.
5. Experience with GitOps workflows, specifically using Argo CD.
6. Experience with cloud platforms such as Azure and AWS.
7. Excellent spoken and written English skills.
8. Strong attention to detail.
9. Ability to perform complex troubleshooting.
10. Availability to participate in an on-call rotation.
Preferred Qualifications :
1. Knowledge of security best practices in a cloud environment.
2. Experience with scripting and automation using languages such as Python, Bash, or Go.
3. Ability to monitor and explore metrics and logs through New Relic.
Functional Areas: Software/Testing/Networking