Recro

4.2

based on 34 Reviews

Recro.io - Lead Data Engineer - ETL/PySpark (8-12 yrs)

8-12 years

posted 9d ago

Job Role Insights

Flexible timing

Key skills for the job

Data Engineering, Python, SQL, ETL Testing, Big Data, Hadoop Administration

Job Description

We are looking for an experienced Lead Data Engineer to join our dynamic team and help us build innovative data engineering solutions that empower businesses to leverage the full potential of their data.

As a Lead Data Engineer, you will be responsible for building scalable data pipelines, managing large datasets, and designing end-to-end data architectures to derive actionable insights from terabyte-scale data.

Key Responsibilities:

- Build scalable data engineering solutions to digitize and derive insights from unused or underutilized data sources.

- Develop robust ETL processes (Extract, Transform, Load) that efficiently handle and transform large datasets, integrating them into a centralized data lake or warehouse.

- Create BI streaming pipelines to handle real-time data processing and provide actionable insights across business functions.

- Design and implement data solutions for terabyte-scale datasets, ensuring high performance, scalability, and reliability.

- Use cloud platforms such as Azure (Data Lake, Data Factory, Databricks) and AWS (including Snowflake on AWS) to architect cloud-based data solutions.

- Work with Big Data technologies such as Hadoop, PySpark, and Kafka to process large-scale data and ensure effective data storage and access.

- Manage end-to-end deployment of data pipelines and infrastructure using CI/CD tools such as Jenkins to streamline development, testing, and production deployment.

- Ensure automated testing, monitoring, and troubleshooting of data pipelines to guarantee continuous data flow and operational stability.

- Lead data engineering projects from inception to delivery, ensuring they meet business requirements, performance standards, and timelines.

- Work closely with international clients to understand their data requirements, deliver custom solutions, and provide expert advice on best practices.

- Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to design and implement data solutions.

- Mentor junior and mid-level engineers, helping them to improve their technical skills and grow professionally within the company.

- Drive adoption of best practices in data engineering, including data governance, security, and compliance with industry standards.

- Foster a culture of continuous learning and innovation within the data engineering team.

Required Skills & Qualifications:

- Expertise in Azure (especially Azure Data Lake, Data Factory, Databricks) and/or AWS (particularly Snowflake on AWS).

- Hands-on experience in building cloud-based data architectures and scalable data pipelines.

- Proficiency in Hadoop, Kafka, PySpark, and SQL to process and manipulate large datasets.

- Strong experience working with data lakes, data warehouses, and real-time data streaming.

- Strong programming skills in Python and PySpark for data manipulation and transformation.

- Extensive experience writing optimized SQL queries for complex data operations.

- 8-12 years of experience in Data Engineering with a focus on Big Data and cloud solutions.

- Proven ability to lead teams and manage end-to-end project delivery while working with cross-functional teams and international clients.

- Experience with CI/CD pipelines, particularly in deploying and managing data engineering solutions using tools like Jenkins.

- Strong understanding of data architecture, ETL processes, data lakes, and data warehousing concepts.

- Ability to design solutions for both batch and streaming data