Required Skills:
- Container Management (microservices): Docker and Kubernetes provisioning, orchestration and clustering.
- Cloud Solutions: AWS (Associate-level certification or higher is a plus).
- Configuration Management: Terraform scripts, Shell scripting.
- Monitoring Tools: Prometheus/Grafana, Node Exporter, Nagios, CloudWatch, etc.
- Ability to ensure smooth software deployment by writing script updates and running diagnostics, and to provide Level 2 technical support.
- Proficiency in Linux commands and fundamentals.
- Experience implementing DevOps and security best practices.
- ELK stack (Elasticsearch, Logstash, Kibana) / Loki.
- Hands-on experience with databases including MySQL, MongoDB and PostgreSQL.
- Working experience with Scylla, Kafka and ClickHouse is a plus.
- Deep understanding of Nginx and ability to write configurations based on various requirements.
- Knowledge of various deployment strategies for Python, Node.js and Ruby applications.
- Knowledge of various domain routing policies.
- Good knowledge of writing Lambda functions.

Responsibilities:
- Perform root cause analysis of production errors and resolve technical issues.
- Establish, maintain and evolve continuous integration and deployment (CI/CD) pipelines for existing and new services.
- Lead incident response efforts, troubleshoot issues, and perform root cause analysis to prevent recurrence.
- Implement security best practices, including automated security checks, vulnerability scanning and secure access controls.
- Follow best practices such as test-driven development (TDD) and continuous integration (CI).
- Secure, scale, and manage Linux virtual environments.
- Identify systems that can benefit from automation, monitoring and infrastructure-as-code, and develop and scale products and services accordingly.
- Write clean, stable and safe code in short time frames and frequent increments.
- Apply good troubleshooting skills, problem-solving skills and attention to detail.