Data Engineer - ETL (8-14 yrs)
Forward Eye Technologies
Job Title : Data Engineer (PySpark, AWS, SQL)
Job Description :
We are looking for a skilled Data Engineer with expertise in PySpark, AWS, and SQL to support data processing and analytical initiatives. This role involves working closely with data engineering and data science teams to build, maintain, and optimize large-scale data pipelines and integrations on AWS. The ideal candidate will be proficient in ETL processes using PySpark and SQL, with a deep understanding of cloud data infrastructure, specifically within the AWS ecosystem.
Key Responsibilities :
Data Pipeline Development :
- Design, build, and optimize ETL pipelines using PySpark for data ingestion, transformation, and storage on AWS.
- Collaborate with stakeholders to understand data requirements, translating them into scalable data solutions.
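To make the expectation concrete, here is a minimal sketch of such a pipeline, assuming hypothetical bucket names, paths, and an orders schema (none of these specifics come from the posting):

# Minimal PySpark ETL sketch: ingest raw CSV from S3, transform, write Parquet.
# Bucket names, paths, and columns are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

raw = (spark.read
       .option("header", "true")
       .csv("s3://example-raw-bucket/orders/"))          # hypothetical source path

transformed = (raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))  # normalize types
    .withColumn("order_date", F.to_date("order_ts"))     # derive partition column
    .filter(F.col("amount").cast("double") > 0)          # drop invalid rows
    .dropDuplicates(["order_id"]))                       # basic de-duplication

(transformed.write
    .mode("overwrite")
    .partitionBy("order_date")                           # partition for downstream query pruning
    .parquet("s3://example-curated-bucket/orders/"))     # hypothetical target path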
Cloud Infrastructure Management :
- Develop and manage AWS services such as S3, Glue, Lambda, EMR, Redshift, and RDS for data processing and storage.
- Implement data workflows that handle both batch and real-time processing needs, ensuring low latency and efficient data access.
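As one illustration of wiring these services together, an S3-triggered Lambda that starts a Glue job when new data lands might look like the sketch below; the Glue job name and argument key are placeholders, not details from the posting:

# Sketch: S3-triggered Lambda handler that starts a Glue ETL job for the new object.
# The Glue job name and the --input_path argument are illustrative assumptions.
import boto3

glue = boto3.client("glue")

def handler(event, context):
    record = event["Records"][0]["s3"]           # S3 put-event payload
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]
    # Pass the landed object to a (hypothetical) Glue job as a job argument.
    response = glue.start_job_run(
        JobName="orders-etl-job",                # placeholder job name
        Arguments={"--input_path": f"s3://{bucket}/{key}"},
    )
    return {"job_run_id": response["JobRunId"]}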
Database Management :
- Write and optimize complex SQL queries for data extraction and transformation from AWS RDS, Redshift, and other SQL-based databases.
- Leverage indexing, partitioning, and caching techniques to improve query performance on large datasets.
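For example, filtering on a partition column lets Spark prune whole partitions instead of scanning the full dataset, and caching keeps a reused aggregate in memory; a hedged sketch, with table and column names assumed:

# Sketch: query a partitioned table with partition pruning, then cache a
# frequently reused aggregate. Table and column names are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("query_opt").getOrCreate()

# Filtering on the partition column (order_date) lets Spark skip whole
# partitions rather than scanning every file.
daily = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM curated.orders                 -- hypothetical partitioned table
    WHERE order_date = DATE '2024-01-15'
    GROUP BY customer_id
""")

daily.cache()          # reused below, so keep it in memory
daily.count()          # materialize the cache

top = daily.orderBy(daily.total_amount.desc()).limit(100)
top.show()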
Data Quality and Governance :
- Ensure data accuracy, completeness, and consistency throughout the data lifecycle.
- Implement best practices for data quality, governance, and security using AWS-native tools and third-party solutions.
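A lightweight example of the kind of quality gate this implies, with column names and thresholds chosen purely for illustration:

# Sketch: simple data-quality gate run after a load step.
# Path, column names, and thresholds are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()
df = spark.read.parquet("s3://example-curated-bucket/orders/")  # placeholder path

total = df.count()
null_ids = df.filter(F.col("order_id").isNull()).count()
dupes = total - df.dropDuplicates(["order_id"]).count()

# Fail the pipeline rather than silently propagating bad data downstream.
assert null_ids == 0, f"{null_ids} rows missing order_id"
assert dupes / max(total, 1) < 0.01, f"duplicate rate too high: {dupes}/{total}"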
Automation and Optimization :
- Automate repetitive tasks and optimize workflows, ensuring efficient and resilient data processing.
- Use CI/CD pipelines and version control (e.g., Git) to deploy and manage data workflows.
Troubleshooting and Support :
- Identify and resolve issues within the data pipeline, including data access and query performance problems.
- Support the team in data analytics and data science initiatives by preparing data in an accessible and structured format.
Required Skills and Experience :
- 5+ years of data engineering experience with a focus on PySpark, AWS, and SQL.
- Strong proficiency in PySpark for ETL and data transformation tasks.
- Hands-on experience with AWS data services (S3, Glue, Redshift, EMR, Lambda).
- Expertise in SQL and relational database management, with experience optimizing complex queries.
- Experience in data pipeline orchestration and monitoring (Airflow or similar; a minimal DAG sketch follows this list).
- Knowledge of data partitioning, indexing, and caching techniques to enhance performance.
- Familiarity with CI/CD principles and tools, including version control systems like Git.
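As referenced above, a minimal Airflow DAG sketch showing how the ETL and quality-check steps might be orchestrated; the DAG id, schedule, and spark-submit paths are assumptions (Airflow 2.x API):

# Sketch: a minimal Airflow DAG orchestrating the ETL steps sketched earlier.
# DAG id, schedule, and spark-submit paths are illustrative assumptions.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="run_etl",
        bash_command="spark-submit s3://example-code-bucket/etl/orders_etl.py",  # placeholder
    )
    run_checks = BashOperator(
        task_id="run_quality_checks",
        bash_command="spark-submit s3://example-code-bucket/etl/dq_checks.py",   # placeholder
    )
    run_etl >> run_checks   # quality checks run only after the load succeeds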
Nice to Have :
- Experience with AWS Redshift Spectrum and Athena for querying data on S3.
- Knowledge of streaming data processes and tools like Kinesis or Kafka.
- Background in big data technologies, such as Hadoop or Apache Spark.
- Experience with data lake architectures and building data marts for analytics.
Functional Areas: Software/Testing/Networking