As a Site Reliability Engineer, you will be responsible for driving the effort to identify, design, and develop the best technical and field solutions to automate our production systems. This position will collaborate often with various internal and external business and engineering teams. You will also have an opportunity to eventually lead efforts to champion and instill a culture of DevOps at Skyflow.
We know great Site Reliability Engineers come from diverse backgrounds, so no single individual may have all the desired skills on day one. But if you are the kind of software engineer who would have loved to engineer solutions for the Stripe or Twilio APIs, the Slack or Zendesk apps, or the Snowflake or MongoDB platforms, we want to talk to you.
Desired Qualifications:
5+ years of overall hands-on experience, including 2+ years in infrastructure automation and software delivery using DevOps practices
Familiarity with cloud platforms (e.g., AWS, Azure, GCP)
Coding experience with Go (preferred) or Python
Experience with DevOps tools such as CloudFormation/Terraform, Jenkins, and Ansible
Hands-on experience with Linux Systems Engineering, Docker and Kubernetes container orchestration, RDBMS, and scripting for automation
Ability to work with distributed teams to provide technical guidance and leadership
Solid understanding of the common challenges with migrations and modernizations, and the ability to choose the right path based on previous experience
Expertise with application observability patterns and site reliability practices
Extensive experience working with large distributed infrastructures
Responsibilities:
Use programming languages such as Python and Go, container orchestration services including Docker and Kubernetes, CM tools including Terraform and Helm, and a variety of AWS tools and services on a daily basis
Develop and maintain CI/CD pipelines to enable automated testing, building, and deployment of applications
Collaborate with cross-functional teams and clients to deliver robust cloud-based solutions that provide best-in-class experiences for Skyflow customers
Automate and maintain tools/systems involving software builds, continuous testing, automated deployments, software health monitoring and software releases
Evaluate reliability, performance, scalability, and engineering aspects to ensure a smooth software production rollout and delivery
Be a thought leader and key contributor within our DevOps team and help build a DevOps culture