As a Site Reliability Engineer (SRE) at Aviato, you'll be the guardian of our infrastructure and applications, ensuring seamless operation and optimal performance.
With a focus on Kubernetes and cloud technologies, you'll collaborate with diverse teams to design, implement, and maintain scalable, secure, and highly reliable systems.
This role offers exciting opportunities to deepen your expertise and become a recognized subject matter expert.
Key Responsibilities:
System Reliability: Collaborate with development teams to ensure reliability is embedded throughout the software development lifecycle.
Design and implement resilient and scalable architectures for critical applications.
Kubernetes Operations: Design, implement, and manage Kubernetes clusters, ensuring high availability, fault tolerance, and scalability.
Maintain Kubernetes infrastructure with regular updates, patches, and security enhancements.
Automation & Infrastructure as Code (IaC): Automate deployment, scaling, and management of Kubernetes and cloud resources.
Implement CI/CD pipelines for efficient application deployment and updates.
Develop and maintain IaC scripts using tools like Terraform and Ansible.
Monitoring & Alerting: Implement monitoring solutions for Kubernetes clusters and applications.
Proactively identify and resolve performance bottlenecks and reliability issues.
Incident Response: Respond to incidents, minimizing downtime and performing post-incident analysis to prevent recurrence.
Capacity Planning: Plan and optimize infrastructure capacity to support current and future workloads.
Security: Collaborate with security teams to enforce best practices in Kubernetes environments.
Conduct security audits and vulnerability assessments.
Collaboration & Documentation: Collaborate effectively with development, operations, and security teams.
Maintain comprehensive documentation for system configurations, troubleshooting, and processes.
Qualifications:
Proficiency in cloud platforms (especially GCP) and cloud-native applications.
Experience with IaC tools (e.g., Terraform, Ansible).
Solid understanding of at least one programming or scripting language, such as Python, Bash, Go, or Java.
Proven experience designing and implementing CI/CD pipelines.
Strong analytical and troubleshooting skills for complex infrastructure and application issues.
Excellent communication and collaboration skills.
Technologies & Tools:
Ansible
Apigee
Bamboo
Dynatrace
Git
Google Cloud (GCP)
Grafana
Kubernetes, Docker
Jenkins, GitHub Actions, Bitbucket Pipelines
Sentry
Terraform
JIRA, ServiceNow
Why Join Aviato?
At Aviato, we are redefining the way technology is delivered.
Founded by ex-Googlers, we want to bring Google's culture to Aviato. We are a forward-thinking, fast-growing company that values transparency, collaboration, and doing the right thing.
If you're passionate about building world-class engineering solutions in cloud technologies and Kubernetes, we invite you to join our team and grow with us!