Steer Lean Consulting Senior Data Engineer Job
5-7 years
Gurgaon / Gurugram
Senior Data Engineer - Azure Databricks/PySpark (5-7 yrs)
Steer Lean Consulting
Posted 1 month ago
Flexible timing
Job Description :
We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be instrumental in designing, developing, and maintaining ETL processes to ensure efficient extraction, transformation, and loading of data from various sources into the data lake and data warehouse.
You will work closely with data engineers, data scientists, and business intelligence teams to build and optimize data workflows that support the project's analytics and reporting needs.
Must-Have Skills :
- AWS Glue (Crawler, Data Catalog).
- Python/PySpark.
- CloudFormation/Terraform.
Good-to-Have Skills :
- Experience with Snowflake and its architecture (internal/external tables, stages, masking policies).
- Knowledge of AWS Services like SNS, S3, Lambda, Secret Manager, and Athena.
- Familiarity with Jira, GitHub, and Agile methodology.
Key Responsibilities :
1. ETL Development :
- Design and develop ETL processes using Databricks PySpark to extract, transform, and load data from heterogeneous sources into our data lake and data warehouse.
- Optimize ETL workflows for performance and scalability, leveraging Databricks PySpark and Spark SQL to efficiently process large data volumes.
- Implement robust error handling and monitoring mechanisms to proactively detect and resolve issues within ETL processes.
- Design and implement data solutions following the Medallion Architecture principles, organizing data into Bronze, Silver, and Gold layers.
- Ensure data is appropriately cleansed, enriched, and optimized at each stage to support robust analytics and reporting.
2. Data Pipeline Management :
- Hands-on experience in creating advanced data pipelines using Databricks Workflows.
- Develop and maintain data pipelines using Databricks PySpark, ensuring data quality, integrity, and reliability throughout the ETL lifecycle.
- Collaborate with data engineering, data science, and business intelligence teams to translate data requirements into efficient ETL workflows and pipelines.
3. Data Analysis and Query Optimization :
- Write and optimize complex SQL queries for data manipulation, aggregation, and analysis within Databricks PySpark applications.
4. Project Coordination and Continuous Improvement :
- Participate in project planning and coordination activities to ensure timely delivery of ETL solutions.
- Stay updated on the latest developments in Databricks PySpark, Spark SQL, and related technologies, recommending and implementing best practices and optimizations.
- Document ETL processes, data lineage, and metadata to facilitate knowledge sharing and ensure compliance with data governance standards.
5. Cloud Platform Expertise :
- Utilize cloud platforms (e.g., AWS, Azure, or GCP) to design and deploy scalable and reliable SaaS solutions.
- Optimize infrastructure for performance, security, and cost efficiency.
Functional Areas: Software/Testing/Networking