Data Engineer - ETL (4-10 yrs)
RCS Groups
posted 11hr ago
Key Responsibilities:
- Design, develop, and implement scalable and efficient ETL pipelines using PySpark on AWS EMR.
- Build, optimize, and maintain complex data workflows and data models with a focus on performance and scalability.
- Leverage functional programming concepts to write clean, reusable, and efficient code.
- Configure and manage AWS EMR clusters, ensuring optimal performance and cost-effectiveness.
- Integrate with AWS S3 for reading and writing data using Spark.
- Troubleshoot, debug, and optimize Spark jobs to ensure smooth execution.
- Utilize Spark UI and other monitoring tools to track, analyze, and resolve issues in Spark jobs.
- Collaborate with data engineers, analysts, and other stakeholders to ensure timely and accurate delivery of data.
Required Skills:
- 5+ years of Python programming experience with strong proficiency in the language.
- 3+ years of hands-on experience in developing ETL pipelines using PySpark on AWS EMR.
- Strong understanding of Spark's DataFrame API and Spark internals.
- Proven experience configuring and managing AWS EMR clusters.
- In-depth knowledge of AWS S3 object storage and integration with Spark.
- Strong troubleshooting and debugging skills for Spark jobs.
- Familiarity with monitoring and optimizing Spark jobs using Spark UI.
Desired Skills:
- Experience with data modeling and functional programming concepts.
- Knowledge of other AWS services (e.g., AWS Lambda, AWS Glue) is a plus.
Functional Areas: Software/Testing/Networking