5 TIH Foundation for IoT & IoE Jobs

Data Platform Engineer

3-6 years

Mumbai

1 vacancy

TIH Foundation for IoT & IoE

posted 3mon ago

Job Role Insights

Flexible timing

Job Description

In this role, you will build scalable data pipelines to ingest, transform, and prepare data from diverse sources—text, speech, images, and video—making it ready for Generative AI model training.

Your work will involve developing and managing the underlying platform while addressing challenges like governance, security, observability, lineage, and scalability.

The outcomes of your work will include efficient tools for data processing, a reliable data platform, and high-quality datasets tailored to the evolving needs of large-scale AI and LLM training.
Collaborating closely with researchers and ML engineers, you will play a pivotal role in enabling BharatGen to deliver state-of-the-art AI models, contributing to the advancement of India’s AI ecosystem through innovative data engineering solutions.

Key Responsibilities:
• Design and Build Scalable Platforms: Develop distributed infrastructure for ingesting, processing, and transforming diverse datasets (text, speech, images, video) at terabyte to petabyte scale.
• Develop Robust Data Pipelines: Create reliable, scalable pipelines to prepare datasets for Generative AI and LLM training.
• Implement Governance and Observability: Build frameworks for data lineage, monitoring, and access control to ensure data quality and operational reliability.
• Optimize Performance and Cost: Enhance platform performance and resource utilization using cost-effective strategies, including GPU-accelerated preprocessing.
• Collaborate and Innovate: Work closely with researchers and ML engineers to adapt platforms and data pipelines to evolving LLM requirements, addressing various data challenges.
• Drive Innovation: Stay updated on emerging tools, frameworks, and best practices to implement cutting-edge solutions for large-scale dataset creation.

Minimum Qualifications and Experience:
1. Education:
• Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
• [Preferred] Advanced degrees or certifications in Distributed Systems, Data Engineering, or Big Data technologies.

2. Experience and Expertise:
• 3+ years of overall industry experience in engineering roles, demonstrating strong foundations in software development, systems engineering, or related disciplines.
• 1+ years of specific hands-on experience in developing large-scale, distributed data pipelines and platforms, preferably in high-performance AI or ML environments.
• Expertise in managing unstructured data (text, speech, or multimodal datasets) for high-performance use cases, ideally in the context of LLM/AI datasets.
• Understanding of challenges in scalable data engineering, including ingestion, transformation, and storage optimization for large-scale accelerated workflows.

Skills:
1. Technical:
• Proficiency in distributed systems and frameworks (e.g., Kafka, Ray, PySpark) for scalable data workflows.
• Exposure to end-to-end data lifecycle management, including DataOps.
• Strong programming skills in Python, Scala, or Go, with a focus on high-performance pipeline development.
• Experience with building and optimizing data pipelines, including ETL processes, data modeling, and integration into scalable workflows.
• Expertise in data scraping, crawling frameworks, and modern dataset development techniques such as synthetic data generation.
• Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
• Deep understanding of data platform design, including data architecture, metadata tracking, data lineage, observability, monitoring, and scalability best practices.
• Familiarity with Infrastructure-as-Code tools (e.g., Terraform, CloudFormation), CI/CD pipelines, relational/NoSQL databases, and GPU-accelerated workflows.
• Familiarity with visualization and monitoring tools for lifecycle management and pipeline performance tracking.

2. Soft Skills:
• Adaptability and innovation in fast-paced, dynamic environments.
• Strong collaboration skills for interdisciplinary teamwork.
• Proactive problem-solving and a growth mindset to thrive in a mission-driven organization.

Other terms:
• The position is contractual and full-time, and is subject to periodic performance reviews.


Employment Type: Full Time, Permanent

What TIH Foundation for IoT & IoE employees are saying about work life

based on 10 employees
Flexible timing: 100%
Monday to Friday: 100%
Within city: 42%

TIH Foundation for IoT & IoE Benefits

Free Transport
Child care
Gymnasium
Cafeteria
Work From Home
Free Food

Data Platform Engineer

3-6 Yrs

Mumbai

3mon ago·via naukri.com

Senior Engineer - AI/ML

3-4 Yrs

Mumbai

19d ago·via naukri.com

Project Manager

5-7 Yrs

Mumbai

2mon ago·via naukri.com

Generative AI Tech Lead (Large Scale Text/Speech Models)

5-10 Yrs

Mumbai

5mon ago·via naukri.com

Senior Engineer - AI/ML

5-9 Yrs

Mumbai

5mon ago·via naukri.com