AWS Data Engineer - Spark/Python (10-12 yrs)
Elements
Flexible timing
Position Overview
We are seeking a skilled AWS Data Engineer with strong expertise in designing and optimizing scalable data pipelines and processing systems. The ideal candidate has in-depth knowledge of Spark, PySpark, and AWS cloud services, along with hands-on experience in data integration, transformation, and warehousing. The role involves collaborating with cross-functional teams to build robust solutions for large-scale data challenges that enable data-driven decision-making.
Key Responsibilities:
Data Pipeline Development:
- Design, develop, and maintain scalable data processing pipelines using Spark and PySpark.
- Optimize Spark jobs to enhance performance and efficiency in large-scale data environments (see the brief PySpark sketch after this responsibilities list).
Data Transformation:
- Write and manage complex SQL queries to manipulate, clean, and transform datasets.
- Develop and deploy data workflows to meet business needs.
AWS Cloud Services:
- Work extensively with AWS services such as Amazon Redshift and AWS Glue, alongside Databricks, to manage and process large datasets.
- Use SQL Server for additional database operations.
Programming & Modularization:
- Use Python for data processing tasks, ensuring modular and reusable code packaging.
- Adhere to best practices for scalable and maintainable Python development.
Data Integration & Real-Time Streaming:
- Implement real-time data streaming and integration using tools like NiFi, Kafka, and EventHub (optional but desirable).
Data Warehousing:
- Work hands-on with Snowflake for managing data warehousing and analytics needs.
Collaborative Development:
- Partner with data scientists, analysts, and engineering teams to fulfill data requirements and support advanced analytics initiatives.
Informatica:
- Apply basic knowledge of Informatica for data integration tasks and workflow automation.
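The following is a minimal, illustrative PySpark sketch of the kind of pipeline and transformation work described above; the S3 paths, table names, and columns are placeholders, not details of this role.

    # Illustrative sketch only: paths, tables, and columns below are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-pipeline-sketch").getOrCreate()

    # Read raw data from hypothetical S3 locations.
    orders = spark.read.parquet("s3://example-bucket/raw/orders/")
    customers = spark.read.parquet("s3://example-bucket/raw/customers/")

    # Clean and transform: drop incomplete rows, derive a partition-friendly date column.
    orders_clean = (
        orders
        .filter(F.col("order_id").isNotNull())
        .withColumn("order_date", F.to_date("order_ts"))
    )

    # Common optimization: broadcast the small dimension table to avoid a shuffle join.
    enriched = orders_clean.join(F.broadcast(customers), on="customer_id", how="left")

    # Aggregate and write partitioned output for downstream warehousing.
    daily_revenue = (
        enriched
        .groupBy("order_date", "customer_segment")
        .agg(F.sum("order_amount").alias("total_revenue"))
    )

    (
        daily_revenue
        .repartition("order_date")            # control output file layout
        .write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/daily_revenue/")
    )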
Required Skills and Qualifications:
- Spark Expertise: Advanced experience with Spark and PySpark, including optimization techniques.
- SQL Knowledge: Strong skills in writing and optimizing complex SQL queries for data transformation.
- Cloud Proficiency: Experience with AWS services (Redshift, Glue) and Databricks.
- Python Programming: Proficient in Python for scripting and data processing tasks.
- Real-Time Data Tools: Familiarity with tools like NiFi, Kafka, and EventHub is highly desirable.
- Snowflake Expertise: Hands-on experience with Snowflake for data warehousing and advanced analytics.
- Informatica: Basic understanding of Informatica tools and their application in data projects.
Preferred Skills (Optional):
- Real-time data streaming experience using Kafka, NiFi, or similar tools.
- Familiarity with EventHub for managing event-driven data workflows.
- Experience in CI/CD pipelines and version control with Git.
Soft Skills:
- Excellent communication and collaboration abilities to work with diverse teams.
- Proactive and detail-oriented with the ability to take ownership of complex data challenges.
- Strong analytical and problem-solving skills.