Monitoring and Incident Management (fix, Notification and Escalation) of the Production Environment
Monitoring of systems and applications within the ICE Data Services production environment
Diagnose and fix (based on run-book and knowledge) Incidents raised through Monitoring Tools, Production Conference Bridges and chats
Work with, and escalate to, other internal and external teams to implement incident fixes, workarounds and data recovery
Open and update production incident tickets according to company standards
Problem Management
Investigate and update incident tickets with root cause and incident description, ensuring appropriate follow-up tickets for corrective actions are assigned
Manage incident tickets through to closure, ensuring incident details are complete and accurate, and all corrective actions have been completed
System and Application Production Readiness
Work with other internal and external teams to expand and maintain operational run-books and other documentation
Application and infrastructure availability checks and tasks at scheduled times
Configuration of Monitoring Tools and alarms
Deployment Management
Production deployments: approve and execute production deployment tasks
Miscellaneous
Participate in disaster recovery, business continuity and workplace recovery events (practice and real life)
Participate in continuous improvement programs, such as trend analysis of recurring issues
Provide and report on performance metrics of the environment
Follow the documented handover process to bring the next shift up to speed and highlight priority items or issues
Responsibilities
Monitor systems and applications within the ICE Data Services business
Analyze and resolve incidents within the stipulated timeline
Perform deployments during the regular maintenance windows
Work closely with the Development and QA teams on root cause tasks
Knowledge and Experience
IT-based university degree
0-3 years' experience in IT systems support and/or operational support of applications and databases in both Windows and Linux/Unix OS environments
Proficiency in Bash (Linux scripting) and working knowledge of a broad range of Linux and Windows core utilities/applications
Working knowledge of networking, specifically TCP and UDP
Willingness to work shifts as the company requires and to be contacted after hours to help resolve issues where required
Candidates must be able to work unsocial hours and days, including weekends and bank holidays
Strong skills in the following areas
Business English verbal and written communication skills
High level of general IT skills with email and MS Office applications
Logical approach, critical thinking, and analytical problem-solving skills with an ability to identify the root cause(s) of a problem
Works as a team player, able to liaise effectively with a variety of IT and non-IT contributors and personalities across the organization
Maintains effective relationships with individuals and the team as a whole
Builds team morale and motivates individual members of the team effectively by recognizing their contributions
Shares own knowledge and encourages team members to share their knowledge
Identifies opportunities to encourage positive changes within operations
Achieves a high level of performance from self and team
Ability to be organized and decisive under pressure when managing urgent and critical production issues as they occur
Excellent time management skills: a structured and organized approach to work, with the ability to manage priorities and multi-task
Self-confident and assertive
Schedule
This role offers work-from-home flexibility of one day per week