The Engineer will be responsible for deploying and monitoring AWS infrastructure based on customer requirements to deliver analytical solutions. This role will work closely with Managers, Clients, and Managed Services teams.
Develop provisioning automation for cloud data services, with integrated IAM, networking, security policies as code, and observability.
Implement built-in resiliency and observability, and enable FinOps, as part of infrastructure automation for cloud IaaS and PaaS services.
Troubleshoot and diagnose issues and remediate them accordingly.
Deploy, configure, and manage AWS infrastructure using Infrastructure as Code (Terraform, CloudFormation).
Estimate AWS usage costs and identify operational cost control mechanisms.
Deliver cloud network administration, security administration, instantiation, provisioning, environment optimization, and third-party software support.
Conduct thorough troubleshooting to identify and resolve issues affecting infrastructure components, applications, or services.
Conduct routine health checks on servers, networks, and infrastructure components to identify potential problems before they escalate.
Provide support to end-users for infrastructure-related issues, including access problems, connectivity issues, and basic troubleshooting.
Participate in on-call rotation shifts to provide 24x7 coverage.
Respond promptly to alerts and incidents, and resolve them to closure.
Thorough knowledge of Hadoop and the overall architecture of its ecosystem.
Monitor resource utilization and participate in capacity planning discussions to ensure optimal infrastructure performance.
Partner with developers and internal stakeholders to understand requirements, engineer, optimize, support, and maintain cloud-native solutions.
Lead initiatives to ensure the utmost reliability, scalability, and performance of our server, storage, and cloud platforms, employing thorough performance management strategies to identify and proactively address potential issues.
Participate in and often lead comprehensive project assignments, ensuring meticulous attention to detail in maintenance, troubleshooting, and continuous improvement initiatives for optimal company operations.
Maintain scrupulous documentation for system changes, automation processes, and platform configurations, strictly adhering to the company's change management policies.
Excel in collaboration and communication with diverse stakeholder groups, breaking down complex technical topics into clear, understandable insights and advocating for infrastructure needs and priorities.
Monitor and remediate vulnerabilities, ensuring OS and container security updates.