CloudifyOps
Cloud Engineer - Incident Management (3-5 yrs)
Posted 2 months ago
Flexible timing
Skills :
- 3+ years of hands-on experience as a Cloud Engineer.
- Experience monitoring a 24/7 SaaS product, with a proper understanding of incident and problem (I&P) management.
- Basic scripting experience for troubleshooting.
- Deep understanding of manual data analysis and reporting.
- Deep understanding of monitoring tools for troubleshooting and root-cause analysis; experience with Datadog.
- Good knowledge of at least one operating system: Linux, Unix, Solaris, Ubuntu, Windows.
- Any Linux-based operating system is preferred (e.g. Ubuntu or Red Hat).
- Basic understanding of networking (TCP/IP, IP addresses, HTTP, DNS, VPN), especially cloud networking.
- Good understanding of container technology (Docker) and container orchestration (Kubernetes), sufficient for basic troubleshooting.
- Experience debugging production-level disruptions using traces, metrics, and dumps.
- Good hands-on experience with incident and problem management processes and the supporting tools, such as Jira, Mattermost, and Confluence.
- Very good communication skills.
Job Responsibilities :
- Monitor APM and infrastructure for the whole system on a 24/7 basis.
- Develop automation to generate reports.
- Provide first-level analysis using the debugging traces and dumps available in APM tools like Datadog.
- Monitor the quality of incident and problem (I&P) management with proper guidance, and work in collaboration with DevOps teams.
- Prepare and present reports around performance metrics.
- Lead meetings for incident and problem management.
- Analyze historical data and provide insights to support problem clustering.
Functional Areas: Other
Experience: 3-5 Yrs
Location: Bangalore / Bengaluru