AWS Data Engineer - ETL (4-8 yrs)
Neerinfo Solutions
Flexible timing
Key Responsibilities :
AWS Data Infrastructure Development :
- Design, develop, and maintain AWS-based data pipelines using services such as AWS Glue, AWS Redshift, Amazon S3, and AWS Lambda.
- Build ETL (Extract, Transform, Load) processes, integrating batch and near real-time data from various sources to Amazon Redshift, S3, or other cloud-based storage solutions.
- Develop data transformation scripts using Python and PySpark to process large datasets in the cloud.
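For illustration, a minimal PySpark batch-ETL sketch of the kind this role involves: reading raw CSV from S3, applying a simple transformation, and writing partitioned Parquet back to S3. All bucket names, paths, and columns are hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Read raw CSV from a hypothetical source bucket
raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-raw-bucket/orders/")
)

# Simple transformations: parse timestamps, derive a partition column,
# and filter out invalid rows
transformed = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount").cast("double") > 0)
)

# Write partitioned Parquet to a hypothetical curated bucket
(
    transformed.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/orders/")
)
```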
Big Data Technologies :
- Work with Apache Spark and the Hadoop ecosystem to manage large-scale data processing workloads.
- Optimize Apache Spark and Hadoop jobs for performance and scalability, ensuring data pipelines run efficiently at scale.
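A sketch of common Spark tuning levers implied by the bullet above: explicit shuffle partitioning, caching a DataFrame that is reused across actions, and a broadcast-join hint for a small dimension table. Table paths and join keys are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    .config("spark.sql.shuffle.partitions", "400")  # size to the cluster
    .getOrCreate()
)

facts = spark.read.parquet("s3://example-curated-bucket/orders/")
dims = spark.read.parquet("s3://example-curated-bucket/customers/")

facts = facts.repartition("customer_id")  # co-locate join keys before joining
facts.cache()                             # reused across several actions

# Broadcasting the small dimension table avoids a full shuffle join
joined = facts.join(broadcast(dims), "customer_id")
joined.write.mode("overwrite").parquet("s3://example-curated-bucket/enriched/")
```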
SQL and Database Optimization :
- Write and optimize complex SQL queries for data manipulation and aggregation in cloud data warehouses.
- Use AWS Redshift for OLAP workloads, Hive for big data processing, or similar data warehouse solutions.
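A small example of the kind of SQL aggregation described above, run here through Spark SQL; the same query shape applies in Redshift or Hive. Table and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

# Register a hypothetical curated dataset as a SQL view
spark.read.parquet("s3://example-curated-bucket/orders/") \
    .createOrReplaceTempView("orders")

# Aggregate daily revenue and order counts per customer
daily_revenue = spark.sql("""
    SELECT order_date,
           customer_id,
           SUM(amount) AS revenue,
           COUNT(*)    AS order_count
    FROM   orders
    WHERE  order_date >= DATE '2024-01-01'
    GROUP  BY order_date, customer_id
""")
daily_revenue.show()
```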
Cloud and Data Security :
- Implement security measures to ensure the protection and privacy of sensitive data within the AWS ecosystem, following industry best practices (one basic measure is sketched after this list).
- Work closely with data security teams to ensure compliance with data governance and regulatory requirements.
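A hedged sketch of one basic security measure of the kind described above: enforcing default server-side encryption (SSE-KMS) on an S3 bucket with boto3. The bucket name and key alias are hypothetical, and real setups also involve IAM policies, bucket policies, and network controls.

```python
import boto3

s3 = boto3.client("s3")

# Require SSE-KMS encryption by default for all new objects in the bucket
s3.put_bucket_encryption(
    Bucket="example-curated-bucket",  # hypothetical bucket
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/example-data-key",  # hypothetical
                }
            }
        ]
    },
)
```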
Scheduling and Automation :
- Use scheduling tools such as Apache Airflow for workflow automation and pipeline orchestration.
- Set up and maintain automated pipelines, monitor job performance, and manage job failures to ensure the continuity of data workflows.
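A minimal Airflow DAG sketch for the orchestration duties above: a daily schedule, retries on failure, and two dependent tasks. The DAG id, task bodies, and schedule are hypothetical.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("extract from source")   # placeholder task body

def load():
    print("load into Redshift")    # placeholder task body

with DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    # Retries give transient failures a chance to recover before paging anyone
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # load runs only after extract succeeds
```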
Documentation and Best Practices :
- Write readable, maintainable documentation for the data engineering components being developed, supporting transparency, knowledge sharing, and onboarding.
- Follow coding best practices and standards for Python and other technologies used in the development of data pipelines.
Collaboration and Cross-Functional Teams :
- Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver solutions that enhance business decision-making.
- Participate in agile development processes, contributing to sprint planning and progress tracking.
Required Qualifications and Skills :
Experience :
- 4 to 8 years of experience in data engineering, with significant hands-on experience in AWS services and data pipeline development.
- Strong experience with AWS services including Redshift, Glue, EMR, Lambda, and S3.
- In-depth experience with Apache Spark and Hadoop ecosystem for distributed data processing and analysis.
- Proficiency in Python and PySpark for data engineering tasks, data transformation, and automation.
Technical Expertise :
- Strong understanding of SQL for data manipulation, performance tuning, and data retrieval from cloud-based data warehouses (AWS Redshift, Hive).
- Expertise in designing and developing ETL pipelines (batch and near real-time) for integrating data across different systems and platforms.
- Exposure to data scheduling tools like Apache Airflow for orchestrating and automating workflows.
- Hands-on experience with data security practices and the ability to implement security measures to protect data in AWS.
Cloud & Big Data Technologies :
- Advanced experience with AWS data services (Redshift, Glue, S3, Lambda, EMR) for big data analytics and storage solutions.
- Familiarity with Hadoop, Spark, and other big data processing frameworks.
Tools & Frameworks :
- Knowledge of version control systems like Git for managing codebases and collaborating with teams.
- Experience working in agile development environments, contributing to sprint planning and continuous delivery.
Problem-Solving & Troubleshooting :
- Excellent problem-solving and debugging skills, with the ability to troubleshoot and resolve issues across data pipelines, databases, and AWS infrastructure and to improve system performance.
Soft Skills :
- Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Ability to document and communicate technical requirements and solutions clearly and concisely.
Preferred Skills :
- Experience with data lake architecture and integration of data across multiple sources.
- Certifications in AWS (e.g., AWS Certified Big Data - Specialty, AWS Certified Solutions Architect).
- Experience with containerization technologies such as Docker and orchestration tools like Kubernetes.
- Familiarity with Data Governance and Data Quality practices.
Education :
- Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field.
Functional Areas: Other