KPMG India Data Engineer Interview Questions, Process, and Tips

Updated 5 Nov 2024

Top KPMG India Data Engineer Interview Questions and Answers

  • Q1. How do you handle a changing schema from the source? What are the common issues faced in Hadoop, and how did you resolve them?
  • Q2. Write PySpark code to read a CSV file and show the top 10 records.
  • Q3. What are the optimization techniques applied in PySpark code?

KPMG India Data Engineer Interview Experiences

12 interviews found

Data Engineer Interview Questions & Answers

Anonymous

posted on 17 Oct 2024

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
2-4 weeks
Result
Not Selected

I applied via Naukri.com and was interviewed in Sep 2024. There were 3 interview rounds.

Round 1 - Coding Test 

Some multiple-choice questions, 2 SQL questions, and 2 Python questions were asked.

Round 2 - Technical 

(2 Questions)

  • Q1. Tell me about your project
  • Ans. 

    Developed a real-time data processing system for analyzing customer behavior

    • Used Apache Kafka for streaming data ingestion

    • Implemented data pipelines using Apache Spark for processing and analysis

    • Utilized Elasticsearch for storing and querying large volumes of data

    • Developed custom machine learning models for predictive analytics

  • Answered by AI
  • Q2. Optimisation techniques that you have used
  • Ans. 

    I have used partitioning and indexing to optimize query performance.

    • Implemented partitioning on large tables to improve query performance by limiting the data scanned

    • Created indexes on frequently queried columns to speed up data retrieval

    • Utilized clustering keys to physically organize data on disk for faster access

  • Answered by AI
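The indexing point in the answer above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than any warehouse the candidate may have worked with, and the table and index names (orders, idx_orders_customer) are hypothetical. It shows only indexing, not partitioning or clustering keys.

```python
import sqlite3

# Hypothetical table and index names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT amount FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the human-readable steps of SQLite's query plan for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Without an index, the filter requires scanning the whole table.
plan_before = plan(query)

# With an index on the filtered column, SQLite can look up matching rows directly.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows the change from a table scan to an index search. The exact wording of the plan text varies between SQLite versions.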
Round 3 - Technical 

(2 Questions)

  • Q1. Window partition question was asked
  • Q2. Project related question

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Not Selected

I applied via Naukri.com and was interviewed in Sep 2024. There were 2 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. Spark optimization techniques
  • Ans. 

    Spark optimization techniques involve partitioning, caching, and tuning resources for efficient data processing.

    • Partitioning data to distribute workload evenly

    • Caching frequently accessed data to avoid recomputation

    • Tuning resources like memory allocation and parallelism

    • Using broadcast variables for small lookup tables

  • Answered by AI
  • Q2. Data warehousing questions
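Two of the Spark techniques listed in Q1 above, broadcasting a small lookup table and caching a reused result, can be sketched as follows. This is a rough illustration, not the candidate's code: the function name and the parameters facts, lookup, and key are hypothetical, and it assumes PySpark is installed when the function is called.

```python
def enrich_with_lookup(facts, lookup, key):
    """Join a large DataFrame to a small lookup DataFrame and cache the result.

    facts, lookup, and key are hypothetical names used for this sketch.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql.functions import broadcast

    # broadcast() hints Spark to ship the small table to every executor,
    # so the large table does not need to be shuffled for the join.
    joined = facts.join(broadcast(lookup), on=key, how="left")

    # cache() marks the result for reuse; it is materialised by the first action.
    return joined.cache()
```

Resource tuning is usually done through configuration rather than code, for example by setting spark.sql.shuffle.partitions to suit the data volume.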
Round 2 - Technical 

(2 Questions)

  • Q1. Project experience
  • Ans. 

    Developed a data pipeline to ingest, process, and analyze real-time streaming data from IoT devices.

    • Designed and implemented data ingestion process using Apache Kafka

    • Utilized Apache Spark for real-time data processing and analysis

    • Developed data models and algorithms to extract insights from the data

    • Worked with stakeholders to understand requirements and deliver actionable insights

  • Answered by AI
  • Q2. Challenges faced
  • Ans. 

    Some challenges faced include data quality issues, scalability issues, and keeping up with evolving technologies.

    • Data quality issues such as missing values, inconsistencies, and errors in data sources.

    • Scalability issues when dealing with large volumes of data and ensuring efficient processing.

    • Keeping up with evolving technologies and tools in the field of data engineering.

    • Collaborating with cross-functional teams and s...

  • Answered by AI

Data Engineer Interview Questions & Answers

Anonymous

posted on 12 Feb 2024

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
No response

I applied via Naukri.com and was interviewed in Jan 2024. There was 1 interview round.

Round 1 - One-on-one 

(5 Questions)

  • Q1. What are columnar storage, Parquet, and Delta? Why are they used?
  • Ans. 

    Columnar storage is a data storage format that stores data in columns rather than rows, improving query performance.

    • Columnar storage stores data in a column-wise manner instead of row-wise.

    • It improves query performance by reducing the amount of data that needs to be read from disk.

    • Parquet is a columnar storage file format that is optimized for big data workloads.

    • It is used in Apache Spark and other big data processing ...

  • Answered by AI
  • Q2. List, tuple, and set in Python; SQL GROUP BY, window functions, and UNION
  • Q3. Explain the project architecture in detail
  • Ans. 

    The project architecture involves the design and organization of data pipelines and systems for efficient data processing and storage.

    • The architecture includes components such as data sources, data processing frameworks, storage systems, and data delivery mechanisms.

    • It focuses on scalability, reliability, and performance to handle large volumes of data.

    • Example: A project architecture may involve using Apache Kafka for ...

  • Answered by AI
  • Q4. How to connect SQL Server to Databricks
  • Ans. 

    To connect SQL Server to Databricks, use JDBC/ODBC drivers and configure the connection settings.

    • Install the appropriate JDBC/ODBC driver for SQL Server

    • Configure the connection settings in Databricks

    • Use the JDBC/ODBC driver to establish the connection

  • Answered by AI
  • Q5. Optimisation techniques used
  • Ans. 

    Optimisation techniques used in data engineering

    • Partitioning data to improve query performance

    • Using indexing to speed up data retrieval

    • Implementing caching mechanisms to reduce data access time

    • Optimizing data storage formats for efficient storage and processing

    • Parallel processing and distributed computing for faster data processing

    • Using compression techniques to reduce storage space and improve data transfer

    • Applying qu...

  • Answered by AI
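For Q4 above, the JDBC route can be sketched roughly as below. The host, database, table, and credential values are placeholders, and the two helper functions are hypothetical names for this illustration. The sketch assumes a SparkSession is already available (as it is in a Databricks notebook) and that the Microsoft SQL Server JDBC driver is on the cluster.

```python
def sqlserver_jdbc_url(host, database, port=1433):
    """Build a JDBC URL in the format used by Microsoft's SQL Server driver."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


def read_sqlserver_table(spark, url, table, user, password):
    """Load one SQL Server table as a Spark DataFrame over JDBC."""
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .load()
    )


# Placeholder server and database names, for illustration only.
url = sqlserver_jdbc_url("myserver.example.com", "sales")
print(url)
```

In a notebook, the call would look like read_sqlserver_table(spark, url, "dbo.orders", user, password). Credentials are better read from a secret store than hard-coded.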

Interview Preparation Tips

Topics to prepare for KPMG India Data Engineer interview:
  • Spark
  • SQL
  • Python

Interview experience
4
Good
Difficulty level
-
Process Duration
Less than 2 weeks
Result
Not Selected

I applied via Recruitment Consultant and was interviewed in May 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. Question related to project
  • Q2. Medium-level SQL question

Data Engineer Interview Questions & Answers

Dark Prince

posted on 18 Jul 2024

Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. PySpark questions were asked
  • Q2. SQL questions were asked

Data Engineer Interview Questions & Answers

Anonymous

posted on 14 Jun 2024

Interview experience
3
Average
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
No response

I applied via Approached by Company and was interviewed in May 2024. There was 1 interview round.

Round 1 - Technical 

(1 Question)

  • Q1. SCD Type 2, Snowflake pipe

Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Jun 2023

Interview experience
1
Bad
Difficulty level
-
Process Duration
2-4 weeks
Result
-

I applied via Approached by Company and was interviewed in May 2023. There were 2 interview rounds.

Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, so make sure to leave the best impression.
Round 2 - Technical 

(2 Questions)

  • Q1. A question on collect list. It should have been a straightforward question, but it was asked vaguely, in terms of comma-separated values.
  • Q2. A few questions on triggers and integration runtime

Interview Preparation Tips

Interview preparation tips for other job seekers - Don't waste your time attending this.
A few of the questions asked were themselves wrong. The interviewer asked them according to her own understanding.
Wishi Kaur was the interviewer; she was rude, did not know how to ask questions, and showed no respect for candidates.
She thinks she knows everything.
It's a humble request to avoid this kind of interviewer.
Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
-
Result
-

I applied via Approached by Company and was interviewed in Mar 2023. There were 2 interview rounds.

Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Don’t add your photo or details such as gender, age, and address in your resume. These details do not add any value.
Round 2 - One-on-one 

(4 Questions)

  • Q1. Python Dictionary operations
  • Q2. Write function to check if number is an Armstrong Number
  • Ans. 

    Function to check if a number is an Armstrong Number

    • An Armstrong Number is a number that is equal to the sum of its own digits raised to the power of the number of digits

    • To check if a number is an Armstrong Number, we need to calculate the sum of each digit raised to the power of the number of digits

    • If the sum is equal to the original number, then it is an Armstrong Number

  • Answered by AI
  • Q3. How to initiate SparkContext
  • Ans. 

    To initiate SparkContext, create a SparkConf object and pass it to the SparkContext constructor.

    • Create a SparkConf object with app name and master URL

    • Pass the SparkConf object to SparkContext constructor

    • Example (after from pyspark import SparkConf, SparkContext): conf = SparkConf().setAppName('myApp').setMaster('local[*]'); sc = SparkContext(conf=conf)

    • Stop SparkContext using sc.stop()

  • Answered by AI
  • Q4. RDDs vs DataFrames. Which is better and why
  • Ans. 

    DataFrames are better than RDDs due to their optimized performance and ease of use.

    • DataFrames are optimized for better performance than RDDs.

    • DataFrames have a schema, making it easier to work with structured data.

    • DataFrames support SQL queries and can be used with Spark SQL.

    • RDDs are more low-level and require more manual optimization.

    • RDDs are useful for unstructured data or when fine-grained control is needed.

  • Answered by AI
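The Armstrong number check described in Q2 above can be written as a short function. This is one straightforward way to do it, working on the decimal digits of a non-negative integer; the function name is our own.

```python
def is_armstrong(n):
    """Return True if n equals the sum of its digits, each raised to the digit count."""
    if n < 0:
        return False
    digits = [int(ch) for ch in str(n)]
    power = len(digits)
    return n == sum(d ** power for d in digits)


# Three-digit Armstrong numbers
print([n for n in range(100, 1000) if is_armstrong(n)])  # [153, 370, 371, 407]
```

For example, 153 qualifies because 1**3 + 5**3 + 3**3 = 1 + 125 + 27 = 153.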

Interview Preparation Tips

Interview preparation tips for other job seekers - Be well versed in PySpark concepts. Do not fake answers. It is fine to admit that you don't know something.

Data Engineer Interview Questions & Answers

Anonymous

posted on 18 Oct 2022

I applied via Approached by Company and was interviewed in Sep 2022. There were 5 interview rounds.

Round 1 - Resume Shortlist 
Pro Tip by AmbitionBox:
Properly align and format text in your resume. A recruiter will have to spend more time reading poorly aligned text, leading to high chances of rejection.
Round 2 - Coding Test 

It was an MCQ test on interpreting code and its output.

Round 3 - Technical 

(4 Questions)

  • Q1. Given a dictionary, find the greatest number for the same key in Python.
  • Ans. 

    Find the greatest number for the same key in a Python dictionary.

    • Use the max() function to find the largest value stored under each key.

    • Iterate through the dictionary's items and apply max() to each key's values.

    • If the dictionary is nested, use recursion to iterate through all the keys.

  • Answered by AI
  • Q2. Write PySpark code to read a CSV file and show the top 10 records.
  • Ans. 

    PySpark code to read a CSV file and show the top 10 records.

    • Import the necessary libraries

    • Create a SparkSession

    • Read the CSV file using the SparkSession

    • Display the top 10 records using the show() method

  • Answered by AI
  • Q3. Write PySpark code to change a column name and divide one column by another column.
  • Ans. 

    PySpark code to change a column name and divide one column by another column.

    • Use the 'withColumnRenamed' method to change the column name

    • Use the 'withColumn' method to divide one column by another column

    • Example: df = df.withColumnRenamed('old_col_name', 'new_col_name').withColumn('ratio', df['col1'] / df['col2']), where 'ratio' is a new column holding the result

  • Answered by AI
  • Q4. What are the optimization techniques applied in PySpark code?
  • Ans. 

    Optimization techniques in PySpark code include partitioning, caching, and using broadcast variables.

    • Partitioning data based on key columns to optimize join operations

    • Caching frequently accessed data in memory to avoid recomputation

    • Using broadcast variables to efficiently share small data across nodes

    • Using appropriate data types and avoiding unnecessary type conversions

    • Avoiding shuffling of data by using appropriate tr...

  • Answered by AI
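Q1 and Q2 of this round can be sketched together. The dictionary question is ambiguous as reported, so the first function assumes the input is a list of dictionaries that may repeat keys. The second assumes an existing SparkSession and a CSV file with a header row. All function names and the file path "data.csv" are hypothetical.

```python
def max_per_key(records):
    """Return the largest value seen for each key across a list of dicts."""
    best = {}
    for record in records:
        for key, value in record.items():
            if key not in best or value > best[key]:
                best[key] = value
    return best


def show_top_rows(spark, path, n=10):
    """Read a CSV file with a header row and print its first n rows."""
    df = spark.read.csv(path, header=True, inferSchema=True)
    df.show(n)
    return df


def main():
    # Requires PySpark; "data.csv" is a placeholder path.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-top-10").getOrCreate()
    show_top_rows(spark, "data.csv")
    spark.stop()


print(max_per_key([{"a": 3, "b": 7}, {"a": 9, "b": 2}, {"a": 5}]))  # {'a': 9, 'b': 7}
```

main() is defined but not called here, so the snippet runs without Spark. Call it on a machine where PySpark and the CSV file are available.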
Round 4 - Behavioral 

(1 Question)

  • Q1. How do you handle a changing schema from the source? What are the common issues faced in Hadoop, and how did you resolve them?
  • Ans. 

    Handling changing schema from source in Hadoop

    • Use file formats that support schema evolution, such as Avro or Parquet, to handle schema changes

    • Implement a flexible ETL pipeline that can handle schema changes

    • Use tools like Apache NiFi to dynamically adjust schema during ingestion

    • Common issues include data loss, data corruption, and performance degradation

    • Resolve issues by implementing proper testing, monitoring, and backup strategies

  • Answered by AI
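The "flexible ETL pipeline" idea in the answer above can be sketched in plain Python. This is a simplified illustration, not a Hadoop or Spark implementation: it assumes records arrive as dictionaries, and the field names and the conform function are hypothetical. Missing fields are filled with defaults, and unexpected fields are set aside rather than dropped silently.

```python
# Hypothetical target schema: field name -> default value for missing fields.
EXPECTED_FIELDS = {"id": None, "name": None, "email": None}


def conform(record, expected=EXPECTED_FIELDS):
    """Split a source record into a row matching the expected schema and any extras."""
    # Keep every expected field, filling in the default when the source omits it.
    row = {field: record.get(field, default) for field, default in expected.items()}
    # Collect fields the target schema does not know about yet.
    extras = {k: v for k, v in record.items() if k not in expected}
    return row, extras


row, extras = conform({"id": 1, "name": "Asha", "phone": "555-0100"})
print(row)     # {'id': 1, 'name': 'Asha', 'email': None}
print(extras)  # {'phone': '555-0100'}
```

Logging or storing the extras makes new source columns visible, so the target schema can be extended deliberately. In Spark, the Parquet reader's mergeSchema option serves a related purpose when files written with different schemas are read together.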
Round 5 - HR 

(1 Question)

  • Q1. Mostly it was salary negotiation

Interview Preparation Tips

Interview preparation tips for other job seekers - Be confident and practice Python, PySpark, and SQL for big data positions.

Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Aug 2024

Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
2-4 weeks
Result
Not Selected

I applied via Naukri.com and was interviewed before Aug 2023. There were 2 interview rounds.

Round 1 - Technical 

(2 Questions)

  • Q1. What is the integration runtime in ADF?
  • Ans. 

    The integration runtime in ADF is the compute infrastructure used to run activities in Azure Data Factory pipelines.

    • The integration runtime is a managed compute infrastructure in Azure Data Factory.

    • It is used to run activities within pipelines, such as data movement or data transformation tasks.

    • The integration runtime can be auto-scaled based on the workload requirements.

    • It supports various data integration scenarios, including b...

  • Answered by AI
  • Q2. How do you copy data from on-premises to the Azure cloud?
  • Ans. 

    Data can be copied from on-premises to the Azure cloud using various methods such as Azure Data Factory, Azure Storage Explorer, Azure Database Migration Service, etc.

    • Use Azure Data Factory to create data pipelines for moving data from on-premises to the Azure cloud

    • Utilize Azure Storage Explorer to manually copy data from on-premises to Azure Blob Storage

    • Leverage Azure Database Migration Service for migrating large volumes of data from on-p...

  • Answered by AI
Round 2 - HR 

(2 Questions)

  • Q1. How many members do you have in your family
  • Ans. 

    I have 4 members in my family including my parents, my sibling, and myself.

    • I have 2 parents

    • I have 1 sibling

    • I am included in the count

  • Answered by AI
  • Q2. Are you ready to relocate?
  • Ans. 

    Yes, I am open to relocating for the right opportunity.

    • I am willing to relocate for the right job opportunity

    • I am open to exploring new locations and experiences

    • I understand the importance of being flexible in the job market

  • Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - prepare everything

KPMG India Interview FAQs

How many rounds are there in KPMG India Data Engineer interview?
KPMG India interview process usually has 2-3 rounds. The most common rounds in the KPMG India interview process are Technical, Resume Shortlist and Coding Test.
How to prepare for KPMG India Data Engineer interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at KPMG India. The most common topics and skills that interviewers at KPMG India expect are SQL, Python, Azure, Data Integration and ETL.
What are the top questions asked in KPMG India Data Engineer interview?

Some of the top questions asked at the KPMG India Data Engineer interview -

  1. How do you handle changing schema from source. What are the common issues faced...
  2. Write Pyspark code to read csv file and show top 10 recor...
  3. What are the optimization techniques applied in pyspark co...
How long is the KPMG India Data Engineer interview process?

The duration of the KPMG India Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.

KPMG India Data Engineer Interview Process

based on 10 interviews

2 Interview rounds

  • Technical Round - 1
  • Technical Round - 2

KPMG India Data Engineer Salary
based on 103 salaries
₹7 L/yr - ₹27.4 L/yr
58% more than the average Data Engineer Salary in India

KPMG India Data Engineer Reviews and Ratings

based on 9 reviews

Overall rating: 4.0/5

Rating in categories

  • Skill development: 4.0
  • Work-life balance: 3.4
  • Salary: 3.3
  • Job security: 3.7
  • Company culture: 3.2
  • Promotions: 3.8
  • Work satisfaction: 3.2
