Data Engineer
The ideal candidates will have 5+ years of experience in data engineering, with a strong focus on Python and SQL programming. The role requires proficiency in leveraging AWS services to build efficient, cost-effective datasets that support business reporting and AI/ML exploration.
Candidates must demonstrate the ability to understand client requirements from a functional perspective and deliver optimized datasets for multiple downstream applications.
The selected individuals will work under the guidance of an onsite lead and closely with client stakeholders to meet business objectives.
Key Responsibilities
Cloud Infrastructure:
- Design and implement scalable, cost-effective data pipelines on the AWS platform using services like S3, Athena, Glue, and RDS (see the sketch below).
- Manage and optimize data storage strategies for efficient retrieval and integration with other applications.
- Support the ingestion and transformation of large datasets for reporting and analytics.
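To give a flavor of this kind of pipeline work, here is a minimal, purely illustrative sketch of running an Athena query over a Glue-catalogued table with boto3. The bucket, database, table, and region names are placeholder assumptions, not client specifics.

import time
import boto3

# Placeholder names for illustration only; real buckets and databases are project-specific.
RESULT_LOCATION = "s3://analytics-query-results/athena/"
DATABASE = "sales_db"

athena = boto3.client("athena", region_name="us-east-1")

# Kick off an Athena query against a Glue-catalogued table.
response = athena.start_query_execution(
    QueryString="SELECT order_id, amount FROM orders LIMIT 10",
    QueryExecutionContext={"Database": DATABASE},
    ResultConfiguration={"OutputLocation": RESULT_LOCATION},
)
query_id = response["QueryExecutionId"]

# Poll until the query finishes; production code would add timeouts and alerting.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

print(f"Query {query_id} finished with state {state}")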
Tooling and Automation:
- Develop and maintain automation scripts using Python to streamline data processing workflows.
- Integrate tools and frameworks like PySpark to optimize performance and resource utilization (see the sketch below).
- Implement monitoring and error-handling mechanisms to ensure reliability and scalability.
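Again purely illustrative: a small PySpark job that ingests raw events, aggregates them, and writes Parquet, with basic logging and error handling. The S3 paths, application name, and column names are assumptions for the sketch.

import logging

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("daily_events_job")

# Hypothetical S3 locations; real paths depend on the client environment.
SOURCE_PATH = "s3://raw-events/2024/"
TARGET_PATH = "s3://curated-events/daily/"

spark = SparkSession.builder.appName("daily-events-transform").getOrCreate()

try:
    # Ingest raw JSON events and aggregate counts per event type.
    events = spark.read.json(SOURCE_PATH)
    summary = (
        events
        .filter(F.col("event_type").isNotNull())
        .groupBy("event_type")
        .agg(F.count("*").alias("event_count"))
    )
    # Write columnar output for cheap downstream Athena scans.
    summary.write.mode("overwrite").parquet(TARGET_PATH)
    log.info("Wrote aggregated event counts to %s", TARGET_PATH)
except Exception:
    # Surface failures so an orchestrator or alerting hook can react.
    log.exception("Daily events transform failed")
    raise
finally:
    spark.stop()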
Collaboration and Communication:
- Work closely with the onsite lead and client teams to gather and understand functional requirements.
- Collaborate with business stakeholders and the data science team to provide datasets suitable for reporting and AI/ML exploration.
- Document processes, provide regular updates, and ensure transparency in deliverables.
Data Analysis and Reporting:
- Optimize AWS service utilization to maintain cost-efficiency while meeting performance requirements.
- Provide insights on data usage trends and support the development of reporting dashboards for cloud costs.
Security and Compliance:
- Ensure secure handling of sensitive data with encryption (e.g., AES-256 at rest, TLS in transit) and role-based access control using AWS IAM (see the sketch below).
- Maintain compliance with organizational and industry regulations.
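As a small illustration of the at-rest encryption requirement, a boto3 upload using S3 server-side encryption (SSE-S3, AES-256); the bucket, key, and file names are placeholders.

import boto3

s3 = boto3.client("s3")  # Credentials come from the environment, e.g. an IAM role.

# Placeholder bucket/key names. SSE-S3 ("AES256") encrypts the object at rest;
# the boto3 HTTPS transport covers TLS in transit.
with open("customers.parquet", "rb") as body:
    s3.put_object(
        Bucket="sensitive-datasets",
        Key="pii/customers.parquet",
        Body=body,
        ServerSideEncryption="AES256",
    )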
Required Skills:
- 5+ years of experience in data engineering with a strong emphasis on AWS platforms.
- Hands-on expertise with AWS services such as S3, Glue, Athena, and RDS.
- Proficiency in Python and SQL for building data pipelines that ingest data and integrate it across applications.
- Demonstrated ability to design and develop scalable data pipelines and workflows.
- Strong problem-solving skills and the ability to troubleshoot complex data issues.
Preferred Skills:
- Experience with big data technologies, including Spark, Kafka, and Scala, for distributed data processing.
- Hands-on expertise with AWS big data services such as EMR, DynamoDB, Athena, Glue, and MSK (Managed Streaming for Apache Kafka).
- Familiarity with on-premises big data platforms and tools for data processing and streaming.
- Proficiency in scheduling data workflows using Apache Airflow or similar orchestration tools such as One Automation and Control-M (see the sketch below).
- Strong understanding of DevOps practices, including CI/CD pipelines and automation tools.
- Prior experience in the telecommunications domain, with a focus on large-scale data systems and workflows.
- AWS certifications (e.g., Solutions Architect, Data Analytics Specialty) are a plus.
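For a flavor of the orchestration work listed above, a minimal Airflow 2.x DAG chaining an ingest step and a transform step; the dag_id, schedule, and task bodies are illustrative assumptions, not a prescribed design.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    # Placeholder: pull raw data into S3 (real logic is project-specific).
    print("ingesting raw data")


def transform():
    # Placeholder: run the Spark/Glue transformation.
    print("transforming dataset")


with DAG(
    dag_id="daily_dataset_refresh",  # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    ingest_task >> transform_task  # transform runs only after ingest succeeds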
Location: Pune
Open Positions: 2