Join the future of API security with a company founded by serial innovators reshaping the software industry. Visionaries Jyoti Bansal and Sanjay Nagaraj, creators of AppDynamics (acquired by Cisco for $3.7B), established Traceable with a bold ambition: to become the global leader in API security.
We're experiencing explosive growth, tripling revenue and scaling rapidly to empower enterprises facing evolving API threats. This success is fueled by a winning strategy: unwavering customer obsession, relentless product innovation, and strategic partnerships, all backed by the entrepreneurial expertise behind past industry-defining success. Our cutting-edge solution makes API security manageable for businesses across the globe, ensuring APIs drive growth, not risk.
Join this winning team and make your mark!
About You
As Manager/Sr Manager/Leader of Site Reliability Engineering (SRE), you will be responsible for leading a team of SREs in designing, building, and maintaining reliable, modern, large-scale cloud-based infrastructure. This role involves optimizing system performance, ensuring high availability, and enhancing the security of the cloud environments. As a leader, you will work closely with development and operations to drive improvements in operational efficiency and establish best practices for cloud infrastructure management. The ideal candidate has extensive experience in building and operationalizing modern large-scale infrastructure with Kubernetes, Kafka, data systems, cloud, etc., strong leadership skills, and a deep understanding of modern SRE principles.
On the SRE team, you'll have the opportunity to manage the complex challenges of scale and fast growth that are unique to Traceable, while using your expertise in coding, algorithms, problem-solving, and SRE practices. We keep Traceable applications up and running, ensuring our customers have the best and most reliable experience possible.
About the Role
Ensure the reliability of cloud-based distributed systems infrastructure & services built to seamlessly scale to tens of billions of events per day.
Responsible for the availability, performance, monitoring, emergency response, and capacity planning of the Traceable cloud services & infrastructure.
Responsible for building and maintaining ultra-modern infrastructure for CI/CD and DevOps.
Responsible for debugging and solving production issues & escalations, working with the rest of the engineering team.
Collaborate with product engineering teams across time zones on the design and operation of systems and services.
Lead, mentor, and manage a team of Site Reliability Engineers to ensure optimal performance and career growth. Establish team goals and objectives aligned with the company's strategic vision. Foster a culture of continuous improvement, collaboration, and innovation within the SRE team.
Qualifications
Bachelor's or Master's degree in computer science
10+ years of work experience in SRE & DevOps with a modern cloud-native tech stack and distributed systems at massively large scale
Strong experience with cloud-native technologies (AWS/GCP, microservices, containers, Kubernetes, etc.) at scale
Strong experience in streaming systems like Kafka Streams or Flink
Hands-on experience in setting up, automating and continuously improving the deployment pipelines and CI/CD infrastructure
Strong experience with Linux systems
Strong experience in operationalizing and scaling modern data systems like MongoDB, Apache Pinot, Trino, Spark, Apache Iceberg and Kafka Streams
Strong experience in infrastructure deployment/provisioning as code using modern tools (Terraform, Helm, Ansible, etc.)
Good expertise in Java and scripting
Strong troubleshooting & debugging skills for production issues & escalations
Experience working in a distributed team with different time zones
A self-starter with the ability to work effectively in teams and in a fast-paced startup setting
Excellent spoken / written communication
Nice to have
Information security experience for modern SaaS companies in Application security, Cloud/Infrastructure security and Shift-left security will be a plus
We value diversity and treatment of employees and applicants is based on merit, talent and qualification. We encourage people from underrepresented groups to apply. We believe the key to success is bringing together unique perspectives and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Direct applicants only. Recruiting agencies: Please do not email or call our team. We are not accepting agency candidates.
For US applicants: Qualified applicants with criminal histories will be considered in a manner consistent with the requirements of the San Francisco Fair Chance Ordinance. All your information will be kept confidential according to EEO guidelines.