Blue Ridge

2.5

based on 11 Reviews

Senior Data Engineer

Blueridge Global

5-10 years

Pune

4 vacancies

posted 6hr ago

Job Role Insights

Fixed timing

Key skills for the job

Databricks, PySpark, Apache NiFi, Airflow

Job Description

POSITION: SENIOR DATA ENGINEER

Are you passionate about building state-of-the-art data platforms and powering the next generation of compute
and AI applications? If so, we'd love to hear from you. This is an exciting opportunity to leverage your expertise
in distributed computing frameworks to make a significant impact as we push the boundaries of supply chain
planning and, eventually, the adoption of GenAI.

JOB BRIEF
We are seeking an experienced Data Engineer to join our team and lead the development of a cutting-edge data
platform. The platform will leverage distributed computing frameworks such as Apache Spark, Databricks, and
Snowflake to enable near-real-time supply chain planning and, eventually, advanced analytics and data insights
through the adoption of Generative AI (GenAI) technologies across our product base.

KEY RESPONSIBILITIES
As a Lead Data Engineer, the candidate will:

• Design and build a highly scalable, fault-tolerant data platform optimized for distributed computing and
large-scale data processing.
• Implement data pipelines and ETL/ELT processes using distributed computing frameworks to efficiently
ingest, transform, and load massive datasets from various sources.
• Leverage cloud data platforms to enable seamless data sharing, near-zero maintenance, and fast analytics
on structured and semi-structured data.
• Collaborate with data scientists, machine learning engineers, and software developers to understand data
requirements and build solutions to power GenAI applications.
• Optimize distributed computing jobs and queries for maximum performance and cost efficiency.
• Implement data governance, security, and compliance best practices.
• Provide guidance on distributed computing architecture and mentor junior data engineers.

QUALIFICATIONS

• 5+ years of experience as a Data Engineer building large-scale data pipelines using big data
technologies (Apache Spark/Kafka/Flink/Storm/Airflow/Hadoop/MapReduce/Redshift/Presto).
• Strong proficiency in SQL, object-oriented programming in Python, and data modelling
techniques.
• Deep expertise in distributed computing principles and frameworks (e.g., Apache Spark), including SQL,
streaming, and optimizing jobs for scale and efficiency.
• Hands-on experience with developing and deploying distributed computing applications using cloud-based
platforms (e.g., AWS EMR, Azure HDInsight, GCP Dataproc, or equivalent).
• Strong understanding of cloud data platform architectures and best practices for ELT/ETL, data
warehousing, data sharing, and query optimization (e.g., AWS Redshift/Athena, AWS Glue, Azure Synapse
Analytics, or equivalent).
• Experience enabling application engineers to build applications leveraging the data platform through APIs
and abstractions.
• Experience with orchestration frameworks like Apache Airflow and data streaming technologies like Kafka, Flink and Apache Storm.
• Knowledge of data lake and lakehouse concepts.
• Experience building and optimizing data pipelines for machine learning applications.
• Good knowledge of performance tuning and troubleshooting of batch and streaming jobs.
• Knowledge of data modelling, data warehousing, and schema design.
• Familiarity with public cloud platforms such as AWS, Azure, or GCP.
• Strong computer science fundamentals in data structures and algorithms.
• Good understanding of metadata-driven development.
• Excellent problem-solving and communication skills.
• Bachelor's or Master's degree in Computer Science (preferred), Engineering, or a related field.