Recro.io - Lead Data Engineer - ETL/PySpark (8-12 yrs)
Recro
posted 9d ago
Flexible timing
Job Description :
We are looking for an experienced Lead Data Engineer to join our dynamic team and help us build innovative data engineering solutions that empower businesses to leverage the full potential of their data.
As a Lead Data Engineer, you will be responsible for building scalable data pipelines, managing large datasets, and designing end-to-end data architectures to derive actionable insights from terabyte-scale data.
Key Responsibilities :
- Build scalable data engineering solutions to digitize and derive insights from unused or underutilized data sources.
- Develop robust ETL (Extract, Transform, Load) processes that efficiently handle and transform large datasets, integrating them into a centralized data lake or warehouse.
- Create BI streaming pipelines to handle real-time data processing and provide actionable insights across business functions.
- Design and implement data solutions for terabyte-scale datasets, ensuring high performance, scalability, and reliability.
- Utilize cloud platforms such as Azure (Data Lake, Data Factory, Databricks) and Snowflake on AWS to architect cloud-based data solutions.
- Work with Big Data technologies such as Hadoop, PySpark, and Kafka to process large-scale data and ensure effective data storage and access.
- Manage end-to-end deployment of data pipelines and infrastructure using CI/CD tools such as Jenkins to streamline development, testing, and production deployment.
- Ensure automated testing, monitoring, and troubleshooting of data pipelines to guarantee continuous data flow and operational stability.
- Lead data engineering projects from inception to delivery, ensuring they meet business requirements, performance standards, and timelines.
- Work closely with international clients to understand their data requirements, deliver custom solutions, and provide expert advice on best practices.
- Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to design and implement data solutions.
- Mentor junior and mid-level engineers, helping them to improve their technical skills and grow professionally within the company.
- Drive adoption of best practices in data engineering, including data governance, security, and compliance with industry standards.
- Foster a culture of continuous learning and innovation within the data engineering team.
Required Skills & Qualifications :
- Expertise in Azure (especially Azure Data Lake, Data Factory, Databricks) and/or Snowflake on AWS.
- Hands-on experience in building cloud-based data architectures and scalable data pipelines.
- Proficiency in Hadoop, Kafka, PySpark, and SQL to process and manipulate large datasets.
- Strong experience working with data lakes, data warehouses, and real-time data streaming.
- Strong programming skills in Python and PySpark for data manipulation and transformation.
- Extensive experience writing optimized SQL queries for complex data operations.
- 8-12 years of experience in Data Engineering with a focus on Big Data and cloud solutions.
- Proven ability to lead teams and manage end-to-end project delivery while working with cross-functional teams and international clients.
- Experience with CI/CD pipelines, particularly in deploying and managing data engineering solutions using tools like Jenkins.
- Strong understanding of data architecture, ETL processes, data lakes, and data warehousing concepts.
- Ability to design solutions for both batch and streaming data.
Functional Areas: Software/Testing/Networking