CoinDCX
Engineering Manager - DevOps
posted 9hr ago
Flexible timing
The CoinDCX journey: building tomorrow, today

At CoinDCX, we believe CHANGE STARTS TOGETHER. You are the driving force that will help us make Web3 accessible to all. In the last six years, we have skyrocketed from being India's first crypto unicorn to carrying a community of over 125 million with us. To continue maximising the adoption and acceleration of Web3, we are now focused on developing cutting-edge products, addressing accessibility and security challenges, and bridging the gap between people and Web3 technologies. While we go ahead and keep dominating the Web3 world, we would like to HODL you on our team!

Join our team of passionate innovators who are breaking barriers and building the future of Web3. Together, we will make the complex simple, the inaccessible accessible, and the impossible possible. Boost your innovation to an ALL TIME HIGH with us!

You need to be a HODLer of these

* 10+ years of experience in DevOps with proven expertise in cloud environments (AWS, GCP, or Azure), Kubernetes, and container orchestration.
* Extensive experience in CI/CD pipeline design and implementation using tools like Jenkins, ArgoCD, or GitHub Actions.
* Deep knowledge of IaC tools (Terraform, CloudFormation, Pulumi) and GitOps methodologies.
* Proven experience designing ultra low latency systems, including network optimization, load balancing, and proximity computing.
* Proficiency in scripting languages such as Python, Bash, or Go for automation tasks.
* Experience with high-throughput, low-latency systems and real-time trading platforms.
* Strong understanding of network and container security, encryption, and security compliance in regulated industries.
* Familiarity with blockchain and crypto-specific security concerns is a plus.
* 3+ years of experience in a leadership role managing a DevOps team.
* Proven ability to manage multiple projects in a fast-paced, high-stakes environment.
* Excellent communication, collaboration, and problem-solving skills.
* Experience in the crypto or fintech sectors is highly desirable.
* A passion for cryptocurrencies and blockchain technology is a plus.

You will be mining through these tasks

* Infrastructure Management & Scalability:
  * Design, deploy, and manage high-availability, low-latency cloud infrastructure on AWS (or GCP/Azure) tailored to the demands of a high-volume crypto trading platform.
  * Architect and optimize network paths and data flows to achieve ultra low latency, including direct interconnects, proximity to exchange gateways, and optimized caching strategies.
  * Oversee the management of Kubernetes clusters (EKS or equivalent) and container orchestration to support microservices and real-time trading systems.
  * Optimize resource utilization and drive cost-effective scaling strategies to support peak trading volumes while maintaining ultra low latency.
* CI/CD & Automation:
  * Lead the development and optimization of CI/CD pipelines for rapid, reliable deployments of trading platform components, smart contracts, and backend services.
  * Champion Infrastructure-as-Code (IaC) using tools like Terraform, CloudFormation, and Helm, ensuring reproducibility and automation across environments.
  * Implement GitOps practices to streamline code-to-deployment cycles and minimize latency introduced by deployment processes.
* Ultra Low Latency Systems:
  * Collaborate with system architects to design infrastructure with an emphasis on reducing end-to-end latency from API gateways to internal microservices and database layers.
  * Work on optimizing network configurations, using technologies like AWS Global Accelerator, low-latency load balancers, and custom routing protocols to achieve near real-time data processing.
  * Ensure that system monitoring includes detailed latency tracking, and drive improvements using techniques such as micro-caching, optimized container orchestration, and hardware acceleration where applicable.
* Security & Compliance:
  * Establish and enforce robust security best practices across the trading platform, including network security, container hardening, and data encryption.
  * Collaborate with the security team to maintain compliance with regulatory standards relevant to the crypto industry.
  * Ensure that critical systems, such as trading engines and wallet services, are secured against DDoS, API abuse, and other potential threats.
* Monitoring, Incident Response & Reliability:
  * Develop and maintain end-to-end observability using Prometheus, Grafana, ELK, Datadog, and other monitoring tools.
  * Set up SLOs/SLIs and alerting systems to proactively monitor system performance, including ultra low latency metrics, and respond to incidents, ensuring minimal downtime during critical trading periods.
  * Lead post-mortem analysis and drive continuous improvement in incident management processes.
* Team Leadership & Collaboration:
  * Mentor and lead a high-performing team of DevOps engineers, fostering a culture of continuous improvement, innovation, and accountability.
  * Work closely with cross-functional teams including engineering, product, and security to align infrastructure strategies with business goals.
  * Drive best practices for automation, scalability, and reliability across the organization.

You need to be a HODLer of these

* 7+ years of hands-on experience in DevOps/SRE with a deep focus on Kubernetes cluster design, cloud-native application deployment, and ultra low latency systems.
* Extensive experience with Kubernetes, container orchestration, and advanced networking (e.g., custom resource definitions, operators, service meshes).
* Proven expertise in ultra low latency network design, including direct cloud interconnects, low-latency load balancing, and optimization of data paths.
* In-depth understanding of security best practices in cloud environments, including container security, encryption, and access control.
* Proficiency with CI/CD, GitOps, and Infrastructure-as-Code tools (Terraform, CloudFormation, Pulumi).
* Strong programming and scripting skills (Python, Go) for automation and tooling.
* Proven experience in senior or leadership roles, mentoring and guiding teams through complex technical challenges.
* Excellent problem-solving, analytical, and communication skills, with the ability to drive consensus across diverse teams.
* Experience in architecting solutions that balance ultra low latency performance with robust security and operational stability.
* A strong background in building and managing cloud-native, high-performance systems in demanding environments.
* Prior experience in industries that require ultra low latency systems (e.g., finance, trading) is highly desirable.

You will be mining through these tasks

* Kubernetes Architecture & Operations:
  * Design, deploy, and manage high-availability, scalable Kubernetes clusters (EKS, GKE, or AKS) powering production-grade applications.
  * Optimize cluster performance with advanced scheduling, resource management, and autoscaling techniques.
  * Drive best practices for container orchestration, network policies, persistent storage, and service mesh integration (Istio/Linkerd).
* Open Application Model (OAM) Implementation:
  * Champion and implement OAM to standardize and simplify the deployment of cloud-native applications in a declarative, platform-agnostic manner.
  * Develop and refine OAM component and trait definitions that support rapid application updates and portability.
  * Integrate OAM with GitOps workflows and CI/CD pipelines to enable seamless, automated deployments.
* Ultra Low Latency Network & Systems Design:
  * Architect and optimize ultra low latency network topologies for data centers and cloud infrastructures, focusing on minimizing network hops, optimizing routing paths, and leveraging specialized load balancing solutions.
  * Collaborate with network engineers to implement technologies such as AWS Global Accelerator, low-latency load balancers, and direct connect solutions.
  * Design systems that prioritize real-time data processing and response times, ensuring that microservices, APIs, and data pipelines meet ultra low latency requirements.
  * Evaluate and integrate hardware accelerators and specialized networking protocols when needed to achieve minimal latency.
* Security Best Practices & Compliance:
  * Implement and enforce robust security measures across the entire infrastructure, including container and network security best practices, encryption (in transit and at rest), and secure configuration management.
  * Develop and maintain strict access control policies using RBAC, network segmentation, and automated compliance checks.
  * Collaborate with security teams to conduct regular vulnerability assessments, penetration tests, and audits, ensuring adherence to industry standards and regulatory requirements.
  * Integrate security into the CI/CD pipeline (DevSecOps) to identify and remediate risks early in the development lifecycle.
* CI/CD, GitOps & Infrastructure Automation:
  * Lead the design, development, and optimization of CI/CD pipelines using Kubernetes-native tools (ArgoCD, GitHub Actions) to ensure rapid, reliable deployments.
  * Drive Infrastructure-as-Code (IaC) initiatives using Terraform, CloudFormation, and Pulumi, ensuring consistent, automated, and reproducible infrastructure deployments.
  * Advocate and implement GitOps best practices to manage Kubernetes configurations and application deployments.
* Observability, Monitoring & Incident Response:
  * Develop comprehensive monitoring, logging, and alerting systems (using Prometheus, Grafana, ELK, Datadog, etc.) that provide deep insights into system performance, including detailed latency metrics.
  * Establish and refine SLOs/SLIs for ultra low latency performance, and drive proactive incident management and post-mortem analyses.
  * Continuously analyze system performance data to identify bottlenecks and implement improvements that enhance overall responsiveness.
* Technical Leadership & Mentorship:
  * Serve as a subject matter expert and mentor for Kubernetes, OAM, ultra low latency network design, and security best practices, sharing knowledge across the organization.
  * Lead technical design reviews, drive innovation, and evaluate emerging cloud-native and network technologies.
  * Collaborate with cross-functional teams including development, security, and product to align infrastructure strategy with business objectives.

Are you the one? Our missing block

* You are knowledge-hungry when it comes to VDA and Web3, always eager to dive deeper and stay ahead in this evolving space.
* The world of Web3 and VDA excites you, fueling your curiosity and driving you to explore new opportunities within this dynamic landscape.
* You act like an owner, constantly striving for excellence, impact, and tangible results in everything you do.
* You embrace a "We over Me" mindset, growing individually while fostering the growth of those around you.
* Change is your catalyst, igniting your passion to build and innovate.
* You think outside the box, unbound by limitations or doubt, always pushing the boundaries of what's possible.

Perks That Empower You

Our benefits are designed to make a lasting impact on your life, giving you the freedom to create a work-life balance that truly suits you.

* Design Your Own Benefit: Tailor your perk package to fit your unique needs. Whether you're eyeing a new gadget or welcoming a furry friend into your life, our flexible benefits ensure that you can prioritize what matters most to you.
* Unlimited Wellness Leaves: We believe in the power of well-being. Take the time you need to recharge, knowing that your health is our priority. With unlimited wellness leaves, you can return refreshed, ready to build and grow.
* Mental Wellness Support: Your mental health is as important as your professional growth. Benefit from access to health experts, free counseling sessions, monthly wellness workshops, and regular team outings, all designed to help you stay balanced and connected.
* Bi-Weekly Learning Sessions: These sessions are more than just updates; they're opportunities to fuel your growth. Stay ahead with the latest industry knowledge, sharpen your skills, and accelerate your career in an ever-evolving landscape.

Employment Type: Full Time, Permanent