5-7 years
Bangalore / Bengaluru
Wealthy - Site Reliability Engineer - Kubernetes (5-7 yrs)
Wealthy
Posted 2 months ago
Job Description:
- Design, implement, and maintain reliable containerized applications using Kubernetes on GCP.
- Develop and optimize SLIs, SLOs, and SLAs for critical systems and services.
- Create and maintain automation for deployment, scaling, and management of applications and infrastructure.
- Implement and manage observability solutions, including monitoring, logging, and alerting systems.
- Conduct capacity planning and performance optimization for Kubernetes clusters and GCP resources.
- Collaborate with development teams to improve application reliability, scalability, and performance.
- Implement and maintain disaster recovery and business continuity plans.
- Continuously improve system reliability through chaos engineering and proactive testing.
- Optimize resource utilization and cost management within GCP.
- Implement security best practices for Kubernetes clusters and GCP services.
- Stay up to date with the latest developments in site reliability engineering, Kubernetes, and GCP services.
- Document processes, runbooks, and best practices for maintaining system reliability.
- Serve as the primary point of contact for interactions with Google Cloud support and other service providers.
- Collaborate with Google Cloud representatives to optimize our use of GCP services and stay informed about new features and best practices.
- Manage relationships with other cloud and service providers, ensuring optimal integration and utilization of their services.
- Rapidly learn and adapt to new technologies and tools as needed.
- Proactively explore and experiment with new technologies and methodologies to improve system reliability and efficiency.
- Implement and manage GitOps workflows using ArgoCD for Kubernetes deployments.
- Design, develop, and maintain Helm charts for streamlined application deployments.
Requirements:
- Strong experience with Kubernetes, including deployment, scaling, and management of containerized applications.
- Extensive knowledge of Google Cloud Platform (GCP) services and best practices.
- Solid understanding of containerization technologies, particularly Docker.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, and alerting tools).
- Strong knowledge of version control systems, particularly Git.
- Deep understanding of networking concepts, including VPNs, VPCs, and gateways.
- Familiarity with database administration, particularly cloud-native database solutions.
- Strong analytical skills with a focus on system reliability and performance optimization.
- Excellent communication skills, including the ability to effectively interact with external service providers and translate technical concepts for non-technical stakeholders.
- Experience in vendor management or working closely with cloud service providers.
- Ability to work in a collaborative team environment.
- Demonstrated ability to quickly learn and adapt to new technologies and tools.
- Familiarity with CI/CD concepts and ability to work with various CI/CD tools.
- Natural curiosity and a tinkerer's mindset, with a passion for understanding how systems work at a deep level.
- History of personal projects or contributions to open-source projects (preferred).
- Ability to think creatively and approach problems from multiple angles.
- Proficiency with ArgoCD and GitOps principles for managing Kubernetes deployments.
- Strong experience with Helm for packaging and deploying Kubernetes applications.
- Certifications such as Certified Kubernetes Administrator (CKA) or Google Cloud Professional DevOps Engineer are a plus.
Functional Areas: Software/Testing/Networking