Data Engineer - AWS Glue/EMR (3-8 yrs)
Metromax Group
Posted 1 month ago
Fixed timing
Key Responsibilities :
- Design and develop data pipelines using AWS Glue, EMR, Spark Scala, and S3 to support both batch and real-time data processing needs.
- Implement ETL processes to extract, transform, and load data from various sources (structured and unstructured) into the data lake.
- Leverage Apache Spark on EMR for big data processing and transformations, using Spark Scala.
- Manage and optimize data storage on S3, ensuring proper data partitioning, file formats (Parquet, ORC, Avro), and lifecycle policies for cost-effective storage solutions.
- Monitor, troubleshoot, and optimize EMR clusters for performance, scalability, and cost efficiency.
- Collaborate with data architects and analysts to ensure seamless data integration and support advanced analytics and machine learning models.
- Automate data workflows using AWS Step Functions, Lambda, and other orchestration tools.
Required Qualifications :
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related technical field.
- 3-5 years of experience in data engineering, particularly using AWS services (EMR, Glue, S3, Lambda).
- Strong expertise in Apache Spark for distributed data processing, with hands-on coding experience in Scala and Python.
- Experience building ETL pipelines and working with big data in a cloud-based lakehouse environment.
- Deep understanding of data formats (Parquet, Avro, ORC) and file optimization techniques.
- Familiarity with data modeling principles, including partitioning, bucketing, and schema management in AWS Glue Data Catalog.
- Strong knowledge of SQL and query optimization for working with large datasets.
- Experience with AWS security services such as IAM, KMS (Key Management Service), and encryption best practices.
- Proficiency in troubleshooting and performance tuning of Spark and EMR clusters for large-scale data processing.
- Familiarity with CI/CD pipelines and infrastructure-as-code (Terraform, CloudFormation) for managing AWS environments.
Preferred Qualifications :
- AWS Certified Data Analytics, Developer, or Solutions Architect certification.
- Experience with streaming data technologies such as Kinesis or Kafka for real-time data ingestion.
- Knowledge of serverless computing and experience with AWS Lambda, Step Functions, and DynamoDB.
- Familiarity with DevOps and automation tools (e.g., Jenkins, Git, Docker).
Soft Skills :
- Strong problem-solving and analytical thinking skills.
- Ability to work collaboratively in a fast-paced, cross-functional environment.
- Excellent communication skills to explain complex technical issues to both technical and non-technical stakeholders.
Functional Areas: Software/Testing/Networking