4-6 years of work experience in monitoring, scripting, systems engineering and troubleshooting.
Solid grasp of Windows & Linux systems including networking concepts.
Experience with monitoring tools such as CloudWatch, Datadog, Sentry, TICK Stack, ELK Stack, Prometheus, Grafana, etc.
Working knowledge of Docker, Kubernetes/EKS and AWS (VPC, EC2, S3, CloudFront, Route 53, RDS, API Gateway, SES, SQS, ElastiCache, Elasticsearch, Auto Scaling, etc.).
Working knowledge of emerging DevOps tools, methods and practices for Continuous Integration, Continuous Deployment, Infrastructure as Code and Configuration Management (e.g. shell and Groovy scripting, Spinnaker, Packer, Ansible, Terraform, Kubernetes, Docker, Jenkins, Git, Bitbucket, GitHub and the feature branch workflow).
Excellent troubleshooting skills.
Knowledge of dynamic tracing and continuous profiling.
Ability to analyze application performance and recommend solutions based on that analysis.
Understanding of Node.js and Python is a plus.
Awareness of critical concepts in infrastructure and application security.
Knowledge of Agile development environments.
Keen sense of urgency and ownership over critical problem areas.
Ability to work in a global, distributed environment and to communicate effectively with different audiences.
You will have an opportunity to:
PARTNER with engineering & product teams to provide highly available and reliable systems, while building best practices and standards.
WORK with the broader team to build and maintain high-performance, flexible and highly scalable web and mobile applications.
PERFORM technical root cause analysis and outline corrective actions for given problems.
PARTICIPATE in a 24/7 rotation for production issue escalation, if needed.
PROVIDE reliable solutions to a variety of problems using sound problem-solving techniques.
MAINTAIN business continuity by driving efforts to make systems highly resilient.
ACHIEVE engineering excellence by implementing best practices and standards.