Tech Mahindra

3.5

based on 38.3k Reviews

Tech Mahindra Data Engineer Interview Questions and Answers

Updated 25 Apr 2025

16 Interview questions

A Data Engineer was asked 1mo ago

Q. What are Spark optimization techniques?

Ans.

Spark optimization techniques enhance performance and resource utilization in Apache Spark applications.

Use DataFrames and Datasets for optimized execution plans.
Leverage Catalyst Optimizer for query optimization.
Apply partitioning to distribute data evenly across nodes.
Utilize caching and persistence to store frequently accessed data.
Optimize joins by using broadcast joins for smaller datasets.
Adjust Spark config...

A Data Engineer was asked 3mo ago

Q. How do you create an external table?

Ans.

Creating an external table allows querying data stored outside the database using SQL.

Use the CREATE EXTERNAL TABLE statement to define the table structure.
Specify the location of the external data, e.g., S3 bucket or HDFS.
Define the schema with appropriate data types for each column.
Example: CREATE EXTERNAL TABLE my_table (id INT, name STRING) LOCATION 's3://my-bucket/data/'
External tables can be used for data la...

A Data Engineer was asked 3mo ago

Q. Given a list, how would you print the count of duplicates for each distinct element?

Ans.

Count duplicates of each distinct element in a list using Python.

Use a dictionary to store counts: {'element': count}.
Example: For list [1, 2, 2, 3], output is {1: 1, 2: 2, 3: 1}.
Utilize collections.Counter for a concise solution.
Example: Counter([1, 2, 2, 3]) gives Counter({2: 2, 1: 1, 3: 1}).
Iterate through the dictionary to print counts.
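
The counting approach described above can be sketched in Python as follows. This is a minimal illustration, not a definitive answer: the helper names `count_occurrences` and `count_duplicates` are hypothetical, and the second helper assumes "duplicates" means elements that appear more than once.

```python
from collections import Counter


def count_occurrences(items):
    """Return a dict mapping each distinct element to how often it appears."""
    return dict(Counter(items))


def count_duplicates(items):
    """Keep only the elements that appear more than once (assumed reading)."""
    return {value: n for value, n in Counter(items).items() if n > 1}


data = [1, 2, 2, 3]

# Iterate through the dictionary to print one count per distinct element.
for value, n in count_occurrences(data).items():
    print(f"{value}: {n}")
```

`Counter` does the tallying in a single pass, so the same helpers work for any list of hashable elements.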

A Data Engineer was asked 5mo ago

Q. Explain the Databricks architecture.

Ans.

Databricks is a cloud-based big data processing platform that combines Apache Spark and Delta Lake.

Databricks uses Apache Spark to process big data in a distributed environment.
It also incorporates Delta Lake for reliable data lakes and data warehousing.
Databricks provides a collaborative workspace for data engineers, data scientists, and analysts.
It offers automated cluster ma...

A Data Engineer was asked 5mo ago

Q. What is the SQL query to calculate the average sales over a period of 7 days?

Ans.

Calculate average sales over a 7-day period using SQL query.

Use the AVG() function to calculate the average sales.
Filter the data based on the date range of the 7-day period using WHERE clause.
Group the data by date to calculate the average sales for each day.
Join the tables if necessary to get the sales data.
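
One possible reading of the steps above, sketched with Python's built-in `sqlite3` module so that it runs without a database server. The `sales` table, its `sale_date` and `amount` columns, and the sample rows are assumptions made for illustration. The query groups by date to get daily totals, filters to the 7 days ending on a given date, and averages those totals.

```python
import sqlite3


def average_sales_last_7_days(conn, end_date):
    """Average of daily sales totals over the 7 days ending on end_date (inclusive).

    Days with no rows in the table are not counted in the average.
    """
    query = """
        SELECT AVG(daily_total)
        FROM (
            SELECT sale_date, SUM(amount) AS daily_total
            FROM sales
            WHERE sale_date BETWEEN date(?, '-6 days') AND date(?)
            GROUP BY sale_date
        )
    """
    return conn.execute(query, (end_date, end_date)).fetchone()[0]


# Hypothetical sample data: one sale per day for 1-10 Jan 2025.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
rows = [(f"2025-01-{day:02d}", 100.0 + 10 * day) for day in range(1, 11)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

print(average_sales_last_7_days(conn, "2025-01-10"))
```

Date arithmetic is dialect-specific: `date(?, '-6 days')` is SQLite syntax, and other engines would need their own equivalent.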

A Data Engineer was asked 6mo ago

Q. What are the differences between Dataflow and Dataproc?

Ans.

Dataflow is a fully managed stream and batch processing service, while Dataproc is a managed Apache Spark and Hadoop service.

Dataflow is a serverless data processing service that automatically scales to handle your data, while Dataproc is a managed Spark and Hadoop service that requires you to provision and manage clusters.
Dataflow is designed for both batch and stream processing, allowing you to process data in r...

A Data Engineer was asked 6mo ago

Q. What is the difference between Bigtable and BigQuery?

Ans.

Bigtable is a NoSQL database for real-time analytics, while BigQuery is a fully managed data warehouse for running SQL queries.

Bigtable is a NoSQL database designed for real-time analytics and high throughput, while BigQuery is a fully managed data warehouse for running SQL queries.
Bigtable is used for storing large amounts of semi-structured data, while BigQuery is used for analyzing structured data using SQL que...

A Data Engineer was asked 6mo ago

Q. Which operator in Composer is used to move data from GCS to BigQuery?

Ans.

The operator used in Composer to move data from GCS to BigQuery is GCSToBigQueryOperator.

GCSToBigQueryOperator is an Apache Airflow operator from the Google provider package; Airflow is the underlying technology of Composer.
This operator allows you to transfer data from Google Cloud Storage (GCS) to BigQuery.
You can specify the source and destination parameters in the operator to define the data transfer process.

A Data Engineer was asked 6mo ago

Q. What are the differences between RDD, DataFrame, and Dataset in PySpark?

Ans.

RDD vs DataFrame vs Dataset in PySpark

RDD (Resilient Distributed Dataset) is the basic abstraction in Spark, representing a distributed collection of objects.
DataFrame is a distributed collection of data organized into named columns, similar to a table in a relational database.
Dataset is a distributed collection of data that supports custom classes for type safety and user-defined functions; the typed Dataset API is available in Scala and Java, not in PySpark.
Dataframes and...

A Data Engineer was asked 6mo ago

Q. Write code to transform the input list [1,2,3,4] to the output list [1,4,9,16].

Ans.

Code to square each element in the input array.

Iterate through the input array and square each element.
Store the squared values in a new array to get the desired output.
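
A minimal Python sketch of the two steps above; the function name `square_all` is hypothetical.

```python
def square_all(values):
    """Return a new list holding the square of each input element."""
    return [v * v for v in values]


print(square_all([1, 2, 3, 4]))  # prints [1, 4, 9, 16]
```

The list comprehension builds a new list and leaves the input unchanged, which matches the "store the squared values in a new array" step.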

Tech Mahindra Data Engineer Interview Experiences

16 interviews found

Data Engineer Interview Questions & Answers

Anonymous

posted on 23 Jan 2025

Interview experience

Bad

Difficulty level

Moderate

Process Duration

6-8 weeks

Result

Not Selected

I appeared for an interview in Dec 2024.

Round 1 - Technical

(2 Questions)

Q1. They asked me about my project.

Q2. SQL queries and Databricks-related questions

Round 2 - Technical

(1 Question)

Q1. Detailed explanation of the project I worked on and optimization techniques

Interview Preparation Tips

Interview preparation tips for other job seekers - I have successfully cleared the L1 and L2 rounds, and my background verification has also been completed. I was informed that an offer would be sent the next day. However, the HR then stated that there would be a discussion solely about the CTC, with no technical questions involved. I agreed to this; yet, the interviewer proceeded to ask high-level technical questions that I had not worked on, leading to the final decision of not selecting me. It felt like a fake interview process, accompanied by delayed communication and a lack of response, resulting in a very negative experience with Tech Mahindra.

Data Engineer Interview Questions & Answers

Rutuja Bhagat

posted on 15 Dec 2024

Interview experience

Good

Difficulty level

Easy

I applied via Recruitment Consultant and was interviewed in Nov 2024. There was 1 interview round.

Round 1 - Technical

(7 Questions)

Q1. Difference between Bigtable and BigQuery.

Ans.

Bigtable is a NoSQL database for real-time analytics, while BigQuery is a fully managed data warehouse for running SQL queries.

Bigtable is a NoSQL database designed for real-time analytics and high throughput, while BigQuery is a fully managed data warehouse for running SQL queries.
Bigtable is used for storing large amounts of semi-structured data, while BigQuery is used for analyzing structured data using SQL queries.
...

Answered by AI

Q2. How do you remove duplicate rows in BigQuery? How do you find the month of a given date in BigQuery?

Ans.

To return rows without duplicates in BigQuery, use the DISTINCT keyword. To find the month of a given date, use the EXTRACT function.

To query without duplicate rows, use SELECT DISTINCT * FROM table_name; this de-duplicates the query result and does not modify the stored table.
To find the month of a given date, use SELECT EXTRACT(MONTH FROM date_column) AS month FROM table_name; the result is the month number (1 to 12).
Make sure to replace 'table_name' and 'date_column' with the appropriate values in your query.

Answered by AI

Q3. What operator is used in Composer to move data from GCS to BigQuery?

Ans.

The operator used in Composer to move data from GCS to BigQuery is GCSToBigQueryOperator.

GCSToBigQueryOperator is an Apache Airflow operator from the Google provider package; Airflow is the underlying technology of Composer.
This operator allows you to transfer data from Google Cloud Storage (GCS) to BigQuery.
You can specify the source and destination parameters in the operator to define the data transfer process.

Answered by AI

Q4. Write code for this: input = [1,2,3,4], output = [1,4,9,16]

Ans.

Code to square each element in the input array.

Iterate through the input array and square each element.
Store the squared values in a new array to get the desired output.

Answered by AI

Q5. Dataflow vs Dataproc.

Ans.

Dataflow is a fully managed stream and batch processing service, while Dataproc is a managed Apache Spark and Hadoop service.

Dataflow is a serverless data processing service that automatically scales to handle your data, while Dataproc is a managed Spark and Hadoop service that requires you to provision and manage clusters.
Dataflow is designed for both batch and stream processing, allowing you to process data in real-t...

Answered by AI

Q6. Architecture of BigQuery. Query optimization techniques in BigQuery.

Ans.

BigQuery architecture includes storage, execution, and optimization components for efficient query processing.

BigQuery stores data in the Capacitor columnar format for fast access.
Query execution is distributed across multiple nodes for parallel processing.
Query optimization techniques include partitioning tables, clustering tables, and using query cache.
Using partitioned tables can help eliminate scanning unnecessary data.
...

Answered by AI

Q7. RDD vs DataFrame vs Dataset in PySpark

Ans.

RDD vs DataFrame vs Dataset in PySpark

RDD (Resilient Distributed Dataset) is the basic abstraction in Spark, representing a distributed collection of objects.
DataFrame is a distributed collection of data organized into named columns, similar to a table in a relational database.
Dataset is a distributed collection of data that supports custom classes for type safety and user-defined functions; the typed Dataset API is available in Scala and Java, not in PySpark.
Dataframes and Data...

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 3 Jan 2025

Interview experience

Good

Difficulty level

Moderate

Round 1 - Technical

(2 Questions)

Q1. What is the SQL query to calculate the average sales over a period of 7 days?

Ans.

Calculate average sales over a 7-day period using SQL query.

Use the AVG() function to calculate the average sales.
Filter the data based on the date range of the 7-day period using WHERE clause.
Group the data by date to calculate the average sales for each day.
Join the tables if necessary to get the sales data.

Answered by AI

Q2. Explain the Databricks architecture.

Ans.

Databricks is a cloud-based big data processing platform that combines Apache Spark and Delta Lake.

Databricks uses Apache Spark to process big data in a distributed environment.
It also incorporates Delta Lake for reliable data lakes and data warehousing.
Databricks provides a collaborative workspace for data engineers, data scientists, and analysts.
It offers automated cluster managem...

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 27 Feb 2025

Interview experience

Good

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

No response

I appeared for an interview in Jan 2025.

Round 1 - Technical

(2 Questions)

Q1. Create external table

Ans.

Creating an external table allows querying data stored outside the database using SQL.

Use the CREATE EXTERNAL TABLE statement to define the table structure.
Specify the location of the external data, e.g., S3 bucket or HDFS.
Define the schema with appropriate data types for each column.
Example: CREATE EXTERNAL TABLE my_table (id INT, name STRING) LOCATION 's3://my-bucket/data/'
External tables can be used for data lakes a...

Answered by AI

Q2. Print count of duplicates of each distinct element in list

Ans.

Count duplicates of each distinct element in a list using Python.

Use a dictionary to store counts: {'element': count}.
Example: For list [1, 2, 2, 3], output is {1: 1, 2: 2, 3: 1}.
Utilize collections.Counter for a concise solution.
Example: Counter([1, 2, 2, 3]) gives Counter({2: 2, 1: 1, 3: 1}).
Iterate through the dictionary to print counts.

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 25 Apr 2025

Interview experience

Excellent

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

Not Selected

I appeared for an interview in Mar 2025, where I was asked the following questions.

Q1. SQL query: Calculate the running_total of sales_amount group by department.

Ans.

Calculate the running total of sales_amount grouped by department using SQL for cumulative sales analysis.

Use the SUM() function with the OVER() clause to calculate running totals.
Example SQL: SELECT department, sales_amount, SUM(sales_amount) OVER (PARTITION BY department ORDER BY sale_date) AS running_total FROM sales;
The PARTITION BY clause divides the result set into partitions to which the SUM() function is applie...
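
The window-function query above can be tried locally with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the sample rows are invented for illustration, the `sales` table layout follows the column names in the example SQL, and the bundled SQLite must be version 3.25 or later for window functions to work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (department TEXT, sale_date TEXT, sales_amount INTEGER)"
)
# Hypothetical sample rows, inserted out of order on purpose.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "2025-01-01", 10),
        ("A", "2025-01-02", 20),
        ("B", "2025-01-01", 5),
        ("A", "2025-01-03", 30),
        ("B", "2025-01-02", 15),
    ],
)


def running_totals(conn):
    """Return (department, sale_date, sales_amount, running_total) rows."""
    query = """
        SELECT department, sale_date, sales_amount,
               SUM(sales_amount) OVER (
                   PARTITION BY department ORDER BY sale_date
               ) AS running_total
        FROM sales
        ORDER BY department, sale_date
    """
    return conn.execute(query).fetchall()


for row in running_totals(conn):
    print(row)
```

The running total restarts for each department because of PARTITION BY. With the default window frame, rows that share the same `sale_date` within a department receive the same running total.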

Answered by AI

Q2. What are Spark optimization techniques?

Ans.

Spark optimization techniques enhance performance and resource utilization in Apache Spark applications.

Use DataFrames and Datasets for optimized execution plans.
Leverage Catalyst Optimizer for query optimization.
Apply partitioning to distribute data evenly across nodes.
Utilize caching and persistence to store frequently accessed data.
Optimize joins by using broadcast joins for smaller datasets.
Adjust Spark configurati...

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 11 Dec 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via Naukri.com and was interviewed in Jun 2024. There was 1 interview round.

Round 1 - Technical

(2 Questions)

Q1. SQL scenario-based questions

Q2. Azure data engineering questions

Data Engineer Interview Questions & Answers

Anonymous

posted on 12 Dec 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Naukri.com and was interviewed in Jun 2024. There were 2 interview rounds.

Round 1 - Coding Test

Questions related to Database and ETL Process

Round 2 - HR

(1 Question)

Q1. Why do you want to join?

Ans.

I am impressed by your innovative projects and collaborative work culture.

I admire your organization's commitment to cutting-edge technology and data-driven solutions.
I am excited about the opportunity to work with a team of talented professionals who share my passion for data engineering.
Your organization's reputation for fostering a collaborative and inclusive work environment is appealing to me.

Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 12 Sep 2024

Interview experience

Average

Round 1 - Technical

(1 Question)

Q1. Mostly on cloud tools

Data Engineer Interview Questions & Answers

Anonymous

posted on 3 Oct 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I appeared for an interview in Apr 2024.

Round 1 - Coding Test

PySpark coding questions and SQL

Interview Preparation Tips

Interview preparation tips for other job seekers - Two interview rounds, first a basic round and then a technical round, followed by a final HR round.

Data Engineer Interview Questions & Answers

Ashwini Malviya

posted on 8 Aug 2024

Interview experience

Average

Round 1 - One-on-one

(1 Question)

Q1. How does BigQuery work?

Ans.

BigQuery is a fully managed, serverless data warehouse by Google Cloud for analyzing large datasets using SQL queries.

BigQuery is a cloud-based data warehouse that allows for fast SQL queries on large datasets.
It is fully managed and serverless, meaning users do not have to worry about infrastructure management.
BigQuery can handle petabytes of data and allows for real-time analytics with its streaming capabilities.
It s...

Answered by AI

Tech Mahindra Interview FAQs

How many rounds are there in Tech Mahindra Data Engineer interview?

Tech Mahindra interview process usually has 1-2 rounds. The most common rounds in the Tech Mahindra interview process are Technical, Resume Shortlist and Coding Test.

How to prepare for Tech Mahindra Data Engineer interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Tech Mahindra. The most common topics and skills that interviewers at Tech Mahindra expect are Python, SQL, Spark, ETL and AWS.

What are the top questions asked in Tech Mahindra Data Engineer interview?

Some of the top questions asked at the Tech Mahindra Data Engineer interview:

How to remove duplicate rows from BigQuery? Find the month of a given date in b...
What operator is used in Composer to move data from GCS to...
What is the SQL query to calculate the average sales over a period of 7 da...

How long is the Tech Mahindra Data Engineer interview process?

The duration of the Tech Mahindra Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.

3.9/5

based on 15 interview experiences

Difficulty level

Easy 50%

Moderate 50%

Duration

Less than 2 weeks 88%

6-8 weeks 13%

Tech Mahindra Data Engineer Salary

based on 775 salaries

₹3.9 L/yr - ₹15.4 L/yr

16% less than the average Data Engineer Salary in India

Data Engineer Jobs at Tech Mahindra

Data Engineer

Hyderabad / Secunderabad

4-9 Yrs

Not Disclosed

Tech Mahindra Salaries in India

Software Engineer 26.7k salaries	₹3.7 L/yr - ₹9.2 L/yr
Senior Software Engineer 22.1k salaries	₹9 L/yr - ₹18.5 L/yr
Technical Lead 12.5k salaries	₹16.9 L/yr - ₹30 L/yr
Associate Software Engineer 6.1k salaries	₹1.9 L/yr - ₹5.7 L/yr
Team Lead 5.3k salaries	₹6.5 L/yr - ₹17.9 L/yr