-Implementing SLOs and SLIs (service level objectives and indicators) following SRE best practices.
-Scaling out observability and related tooling for logs, metrics and traces and onboarding all teams onto our chosen tools.
-Designing and developing sophisticated DevOps solutions to support the deployment and operation of our large language models (LLMs) and our customers' third-party CRM integration experience.
-Creating a robust and seamless CI/CD pipeline to ensure smooth and efficient deployment processes.
-Scaling cloud infrastructure with modern technologies and frameworks.
-Pioneering our infrastructure as code (IaC) framework.
-Ensuring robustness and scalability by leading best practices for monitoring, security, and reliability in DevOps.
-Working towards SOC 2 Type II by helping to implement SOC 2 requirements, audits, and necessary practices.
-Collaborating with cross-functional teams to define and implement infrastructure requirements for the platform.
-Providing technical guidance and mentoring to junior and senior team members, including engineers based in India.
-Working closely with our product and engineering teams to align with customer asks and solve operational problems.
Your Experience Looks Like:
-Bachelor's or master's degree in computer science or a related field.
-Minimum of 10 years of experience in DevOps and cloud infrastructure.
-Excellent English communication skills and experience collaborating with customer-facing product teams.
-Experience leading or managing technical teams.
-Strong knowledge of the AWS cloud platform.
-Strong proficiency in the following DevOps tools: Docker, Kubernetes, GitHub Actions, Terraform.
-Experience with observability best practices and platforms such as New Relic, Grafana, and OpenTelemetry.
-Experience setting up RBAC, security, and networking for mid-size teams of engineers.
-Has gone through a SOC 2 audit in the past.
-Extensive experience with infrastructure as code (IaC) and automation.
-Experience working with LLMs and supporting LLM architectures at scale.
-Excellent problem-solving and debugging skills.
-Startup experience and a scrappy/flexible mindset.