As a Cloud Engineer, you will contribute to building a highly available, global, multi-cloud PaaS platform using open-source technologies to support Simplismart's rapid growth. This system encompasses diverse environments (Kubernetes, VMs, bare metal compute) and provides a cohesive and reliable abstraction for running AI workloads. You will work with cutting-edge technologies and solve complex problems.
To be successful in this role, you need to be deeply technical, possess strong communication and collaboration skills, and have experience in infrastructure-as-code. Proficiency with tools like Terraform and Ansible and strong software development fundamentals are essential. Additionally, you should have good systems knowledge and troubleshooting abilities.
Requirements:
5+ years of experience in platform engineering and in writing high-performance, well-tested, production-quality code.
Proficiency in at least one backend programming language (Python desired; C++ is a plus).
Demonstrated experience with high-performance or distributed cloud microservices architectures.
Ideally, you should have experience building and operating global systems across multiple cloud providers such as AWS, Azure, or GCP.
A good understanding of low-level operating systems concepts, including multi-threading, memory management, networking and storage, performance, and scale.
Pragmatic, methodical, well-organized, detail-oriented, and self-starting.
Experience with Kubernetes, containerization, Terraform and Ansible.
Experience with PyTorch or TensorFlow is a plus (not required).
Knowledge of GPU programming, NCCL and CUDA is a plus.
Responsibilities:
Designing the high-level architecture of the MLOps platform from the ground up.
Handling formalisation of diverse GPU-based workloads.
Developing a robust internal system for continuous deployment of various services and modules in diverse environments.
Creating frameworks for reliable, fault-tolerant systems for mission-critical workloads.
Skills and Attributes:
Deep technical expertise.
Strong communication and collaboration skills.
Experience in infrastructure-as-code (Terraform, Ansible).
Strong software development fundamentals.
Good systems knowledge and troubleshooting abilities.
Ability to work independently and as part of a team.
Proactive and self-motivated.
Why should you join SimpliSmart?
Legacy System Headaches: You won't have to endlessly grapple with outdated legacy systems that hinder your productivity and creativity.
Bossy Culture: At SimpliSmart, we believe in collaboration and empowerment, not hierarchy. You won't have a boss breathing down your neck, but colleagues who support your growth.
Dark Circles: Late nights and overwork are not the norm here. We prioritize work-life balance, ensuring you won't be sporting those tired, dark circles under your eyes.
Stagnation: Say goodbye to redundant and stagnant tasks. We thrive on innovation and dynamic challenges that keep you engaged and motivated.