

PwC Big Data Engineer Interview Questions, Process, and Tips

Updated 18 Jul 2024

PwC Big Data Engineer Interview Experiences

1 interview found

Interview experience
3
Average
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
-

I applied via Naukri.com and was interviewed in Jun 2024. There was 1 interview round.

Round 1 - One-on-one 

(11 Questions)

  • Q1. Working experience in current project
  • Q2. If you have a large dataset that will not fit into memory, how will you load the file?
  • Q3. What is Apache Spark?
  • Ans. 

    Apache Spark is an open-source distributed computing system that provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

    • Apache Spark is designed for speed and ease of use in processing large amounts of data.

    • It can run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

    • Spark provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs.

  • Answered by AI
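Q2 above (loading a file larger than memory) is usually answered with chunked or streaming reads. A minimal sketch in plain Python, assuming a line-oriented text file; in Spark the same idea is automatic, since the file is split into partitions across executors:

```python
def iter_chunks(path, lines_per_chunk=100_000):
    """Yield lists of lines so the whole file is never held in memory."""
    chunk = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            chunk.append(line.rstrip("\n"))
            if len(chunk) >= lines_per_chunk:
                yield chunk
                chunk = []
    if chunk:
        yield chunk  # final partial chunk

# Usage: process each chunk independently, e.g. count total lines:
# total = sum(len(c) for c in iter_chunks("big_file.txt"))
```

With pandas, `pd.read_csv(path, chunksize=...)` gives the same chunk-at-a-time behaviour for CSV input.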
  • Q4. What are the core components of Spark?
  • Ans. 

    Core components of Spark include Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX.

    • Spark Core: foundation of the Spark platform, provides basic functionality for distributed data processing

    • Spark SQL: module for working with structured data using SQL and DataFrame API

    • Spark Streaming: extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams

    • MLlib: Spark's scalable machine learning library

    • GraphX: Spark's API for graphs and graph-parallel computation

  • Answered by AI
  • Q5. If we have streaming data coming from Kafka into Spark, how will you handle fault tolerance?
  • Ans. 

    Implement fault tolerance by using checkpointing, replication, and monitoring mechanisms.

    • Enable checkpointing in Spark Streaming to save the state of the computation periodically to a reliable storage like HDFS or S3.

    • Use replication in Kafka to ensure that data is not lost in case of node failures.

    • Monitor the health of the Kafka and Spark clusters using tools like Prometheus and Grafana to detect and address issues proactively.

  • Answered by AI
  • Q6. What is Hive architecture?
  • Ans. 

    Hive Architecture is a data warehousing infrastructure built on top of Hadoop for querying and analyzing large datasets.

    • Hive uses a language called HiveQL which is similar to SQL for querying data stored in Hadoop.

    • It organizes data into tables, partitions, and buckets to optimize queries and improve performance.

    • Hive metastore stores metadata about tables, columns, partitions, and their locations.

    • Hive queries are converted into MapReduce, Tez, or Spark jobs that run on the Hadoop cluster.

  • Answered by AI
  • Q7. What is vectorization?
  • Ans. 

    Vectorization is the process of converting data into a format that can be easily processed by a computer's CPU or GPU.

    • Vectorization allows for parallel processing of data, improving computational efficiency.

    • It involves performing operations on entire arrays or matrices at once, rather than on individual elements.

    • Examples include using libraries like NumPy in Python to perform vectorized operations on arrays.

    • Vectorized query execution in engines like Hive and Spark processes data in batches rather than row by row.

  • Answered by AI
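The NumPy example mentioned above can be made concrete. A minimal sketch contrasting an element-by-element Python loop with the equivalent vectorized operation:

```python
import numpy as np

def scale_loop(values, factor):
    # Element-by-element: one Python-level operation per item.
    return [v * factor for v in values]

def scale_vectorized(values, factor):
    # Vectorized: one operation applied to the whole array at once,
    # executed in optimized C inside NumPy.
    return np.asarray(values) * factor

data = [1.0, 2.0, 3.0, 4.0]
assert scale_loop(data, 2.0) == [2.0, 4.0, 6.0, 8.0]
assert scale_vectorized(data, 2.0).tolist() == [2.0, 4.0, 6.0, 8.0]
```

Both produce the same result; the vectorized form avoids the per-element interpreter overhead, which is where the speedup comes from.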
  • Q8. Why do we have to do vectorization?
  • Q9. What is partition in hive?
  • Ans. 

    Partition in Hive is a way to organize data in a table into multiple directories based on the values of one or more columns.

    • Partitions help in improving query performance by allowing Hive to only read the relevant data directories.

    • Partitions are defined when creating a table in Hive using the PARTITIONED BY clause.

    • Example: CREATE TABLE table_name (column1 INT, column2 STRING) PARTITIONED BY (column3 STRING);

  • Answered by AI
  • Q10. What are functions in SQL?
  • Ans. 

    Functions in SQL are built-in operations that can be used to manipulate data or perform calculations within a database.

    • Functions in SQL can be used to perform operations on data, such as mathematical calculations, string manipulation, date/time functions, and more.

    • Examples of SQL functions include SUM(), AVG(), CONCAT(), UPPER(), LOWER(), DATE_FORMAT(), and many others.

    • Functions can be used in SELECT statements, WHERE clauses, GROUP BY, and other parts of a query.

  • Answered by AI
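The built-in functions listed above can be tried in any SQL engine. A small sketch using Python's bundled sqlite3 (note SQLite has no DATE_FORMAT; it uses strftime instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("pen", 10.0), ("book", 30.0), ("pen", 20.0)])

# Aggregate functions: SUM and AVG over all rows.
total, avg = conn.execute(
    "SELECT SUM(amount), AVG(amount) FROM sales").fetchone()

# String function: UPPER applied to a column value.
upper_items = [r[0] for r in conn.execute(
    "SELECT DISTINCT UPPER(item) FROM sales ORDER BY 1")]

print(total, avg, upper_items)  # 60.0 20.0 ['BOOK', 'PEN']
```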
  • Q11. Explain RANK, DENSE_RANK, and ROW_NUMBER
  • Ans. 

    RANK, DENSE_RANK, and ROW_NUMBER are window functions used in SQL to assign a rank to each row based on a specified order.

    • RANK assigns the same rank to tied rows and leaves gaps in the sequence after the ties.

    • DENSE_RANK assigns the same rank to tied rows without leaving gaps.

    • ROW_NUMBER assigns a unique sequential integer to each row regardless of ties.

  • Answered by AI
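The difference between the three functions is easiest to see on data with ties. A sketch using Python's sqlite3 (window functions require SQLite 3.25+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INT)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 90), ("b", 90), ("c", 80)])

rows = conn.execute("""
    SELECT name,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS drnk,
           ROW_NUMBER() OVER (ORDER BY score DESC) AS rn
    FROM scores
""").fetchall()

# The tied rows (a, b) both get RANK 1 and DENSE_RANK 1; the next row
# gets RANK 3 (gap) but DENSE_RANK 2 (no gap). ROW_NUMBER is always
# strictly sequential, with an arbitrary order among the tied rows.
by_name = {name: (rnk, drnk, rn) for name, rnk, drnk, rn in rows}
print(by_name["c"])  # (3, 2, 3)
```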

Skills evaluated in this interview


Interview questions from similar companies

Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Selected
Round 1 - Technical 

(1 Question)

  • Q1. Partitioning, broadcast join
Round 2 - One-on-one 

(1 Question)

  • Q1. Client round interview questions
Round 3 - HR 

(1 Question)

  • Q1. Salary negotiation
Interview experience
3
Average
Difficulty level
Hard
Process Duration
Less than 2 weeks
Result
Not Selected

I applied via Naukri.com and was interviewed in Feb 2024. There was 1 interview round.

Round 1 - Technical 

(1 Question)

  • Q1. What is the explode function?
  • Ans. 

    explode function is used in Apache Spark to split a column containing arrays into multiple rows.

    • Used in Apache Spark to split a column containing arrays into multiple rows

    • Creates a new row for each element in the array

    • Syntax: explode(col: Column): Column

    • Example: df.select(explode(col('array_column')))

  • Answered by AI
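The same row-multiplying behaviour can be seen without a Spark cluster using pandas, whose DataFrame.explode is a close analogue of Spark's explode (in PySpark the equivalent call is pyspark.sql.functions.explode):

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})

# One output row per array element; the other columns are repeated.
exploded = df.explode("tags")
print(exploded["tags"].tolist())  # ['a', 'b', 'c']
print(exploded["id"].tolist())    # [1, 1, 2]
```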

Interview Preparation Tips

Interview preparation tips for other job seekers - Cover your basics

Skills evaluated in this interview

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
2-4 weeks
Result
Not Selected

I applied via Naukri.com and was interviewed in Sep 2024. There were 3 interview rounds.

Round 1 - Coding Test 

Some multiple-choice, 2 SQL, and 2 Python questions were asked.

Round 2 - Technical 

(2 Questions)

  • Q1. Tell me about your project
  • Ans. 

    Developed a real-time data processing system for analyzing customer behavior

    • Used Apache Kafka for streaming data ingestion

    • Implemented data pipelines using Apache Spark for processing and analysis

    • Utilized Elasticsearch for storing and querying large volumes of data

    • Developed custom machine learning models for predictive analytics

  • Answered by AI
  • Q2. Optimization techniques that you have used
  • Ans. 

    I have used partitioning and indexing to optimize query performance.

    • Implemented partitioning on large tables to improve query performance by limiting the data scanned

    • Created indexes on frequently queried columns to speed up data retrieval

    • Utilized clustering keys to physically organize data on disk for faster access

  • Answered by AI
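The indexing point above can be sketched with sqlite3; EXPLAIN QUERY PLAN confirms whether the index is actually used (exact plan wording varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INT, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i % 100, "x") for i in range(1000)])

# Index on the frequently filtered column.
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 7").fetchall()
# The plan detail should mention idx_events_user instead of a full scan.
uses_index = any("idx_events_user" in row[-1] for row in plan)
print(uses_index)
```

Partitioning works on the same principle at the storage layer: the engine skips data it can prove is irrelevant instead of scanning everything.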
Round 3 - Technical 

(2 Questions)

  • Q1. Window partition question was asked
  • Q2. Project related question

Skills evaluated in this interview

Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
4-6 weeks
Result
Not Selected

I applied via Company Website and was interviewed in Aug 2024. There were 2 interview rounds.

Round 1 - One-on-one 

(2 Questions)

  • Q1. Project related discussions
  • Q2. Medium-level SQL and DSA
Round 2 - One-on-one 

(2 Questions)

  • Q1. This was a data modelling round
  • Q2. Design an Uber data model
  • Ans. 

    Uber data model design for efficient storage and retrieval of ride-related information.

    • Create tables for users, drivers, rides, payments, and ratings

    • Include attributes like user_id, driver_id, ride_id, payment_id, rating_id, timestamp, location, fare, etc.

    • Establish relationships between tables using foreign keys

    • Implement indexing for faster query performance

  • Answered by AI
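A stripped-down version of the model described above, as SQLite DDL run from Python; the table and column names are illustrative, not a definitive Uber schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users   (user_id   INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE drivers (driver_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rides (
        ride_id    INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        driver_id  INTEGER NOT NULL REFERENCES drivers(driver_id),
        fare       REAL,
        started_at TEXT
    );
    -- Index the foreign key that ride-history queries filter on.
    CREATE INDEX idx_rides_user ON rides(user_id);
""")
conn.execute("INSERT INTO users VALUES (1, 'asha')")
conn.execute("INSERT INTO drivers VALUES (10, 'ravi')")
conn.execute("INSERT INTO rides VALUES (100, 1, 10, 250.0, '2024-06-01')")

fare = conn.execute("SELECT fare FROM rides WHERE user_id = 1").fetchone()[0]
print(fare)  # 250.0
```

Payments and ratings would hang off rides the same way, each with a foreign key to ride_id.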

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare SQL, Python, and data modeling

Skills evaluated in this interview

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
No response

I applied via Newspaper Ad and was interviewed in Aug 2024. There were 3 interview rounds.

Round 1 - Aptitude Test 

There are three sections: 1) Aptitude Test 2) SQL 3) DSA

Round 2 - Technical 

(2 Questions)

  • Q1. What is DSA? What is sorting? What is the difference between an array and a linked list?
  • Ans. 

    DSA stands for Data Structures and Algorithms. Sorting is the process of arranging data in a particular order. An array is a data structure that stores elements of the same data type in contiguous memory locations, while a linked list stores elements in nodes with pointers to the next node.

    • DSA stands for Data Structures and Algorithms

    • Sorting is the process of arranging data in a particular order

    • Arrays store elements contiguously and support constant-time indexing, while linked lists store nodes connected by pointers and must be traversed.

  • Answered by AI
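The contrast above can be sketched with a minimal singly linked list in Python (a plain list plays the role of the array):

```python
class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt  # pointer to the next node

def to_list(head):
    """Walk the chain of pointers: O(n), no random access."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

# Build 1 -> 2 -> 3 by prepending (O(1) per insert at the head).
head = Node(3)
head = Node(2, head)
head = Node(1, head)

arr = [1, 2, 3]          # contiguous storage, O(1) random access: arr[1]
print(to_list(head))     # [1, 2, 3]
```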
  • Q2. Write a SQL query
Round 3 - HR 

(2 Questions)

  • Q1. A coding question, such as adding numbers
  • Q2. Experience on your project
  • Ans. 

    I have experience working on various data analysis projects, including market research, customer segmentation, and predictive modeling.

    • Developed predictive models to forecast customer behavior and optimize marketing strategies

    • Conducted market research to identify trends and opportunities for growth

    • Performed customer segmentation analysis to target specific demographics with personalized marketing campaigns

  • Answered by AI

Skills evaluated in this interview

Interview experience
5
Excellent
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. Power BI Difference between ALL() and ALLSELECTED()
  • Ans. 

    ALL() ignores all filters in the query context, while ALLSELECTED() ignores only filters on columns in the visual.

    • ALL() removes all filters from the specified column or table.

    • ALLSELECTED() removes filters from the specified column or table, but keeps filters on other columns in the visual.

    • Example: ALL('Table') would remove all filters on the 'Table' in the query context.

    • Example: ALLSELECTED('Column') would remove filters on 'Column' while keeping the other filters applied in the visual.

  • Answered by AI
  • Q2. Excel Difference between COUNT() and COUNTA()
  • Ans. 

    COUNT() counts only numeric values, while COUNTA() counts all non-empty cells.

    • COUNT() counts only cells with numerical values.

    • COUNTA() counts all non-empty cells, including text and errors.

    • Example: COUNT(A1:A5) will count only cells with numbers, while COUNTA(A1:A5) will count all non-empty cells.

  • Answered by AI
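Excel itself cannot be scripted here, but the two semantics are easy to mirror in Python: COUNT counts only numeric cells, COUNTA counts any non-empty cell (in Excel, a TRUE/FALSE value in a cell counts for COUNTA but not COUNT):

```python
def excel_count(cells):
    # COUNT(): numeric values only (booleans excluded, as in Excel).
    return sum(1 for c in cells
               if isinstance(c, (int, float)) and not isinstance(c, bool))

def excel_counta(cells):
    # COUNTA(): every non-empty cell, whatever its type.
    return sum(1 for c in cells if c is not None and c != "")

cells = [1, 2.5, "text", "", None, True]
print(excel_count(cells), excel_counta(cells))  # 2 4
```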
Round 2 - Technical 

(2 Questions)

  • Q1. Resume based questions like explain the projects that you have done.
  • Q2. Sample dataset questions

Skills evaluated in this interview

Interview experience
3
Average
Difficulty level
-
Process Duration
-
Result
-
Round 1 - Technical 

(2 Questions)

  • Q1. Project explaining
  • Q2. Write code to generate a CSV file from Notepad data
  • Ans. 

    Code to generate a CSV file with notepad data

    • Open a text file and read the data

    • Parse the data and write it to a CSV file

    • Use libraries like pandas in Python for easier CSV handling

  • Answered by AI
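The steps above can be sketched with the standard csv module, assuming the text file holds delimiter-separated values (the delimiter is a guess you would confirm from the actual file):

```python
import csv

def text_to_csv(src_path, dst_path, delimiter=","):
    """Read a plain text file and rewrite it as a properly quoted CSV."""
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            line = line.strip()
            if line:  # skip blank lines
                writer.writerow(line.split(delimiter))
```

As the answer notes, pandas (`pd.read_csv` plus `DataFrame.to_csv`) does the same job with type inference and easier handling of malformed rows.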
Interview experience
5
Excellent
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Not Selected

I applied via Approached by Company and was interviewed in Aug 2024. There was 1 interview round.

Round 1 - Coding Test 

Maximum substring and reverse a string
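"Maximum substring" most likely means the longest substring without repeating characters, a common coding-test variant; that reading is an assumption. Both tasks sketched in Python:

```python
def reverse_string(s):
    return s[::-1]  # slicing walks the string backwards

def longest_unique_substring(s):
    """Sliding window: length of the longest substring with no repeats."""
    seen = {}        # char -> last index where it was seen
    start = best = 0
    for i, ch in enumerate(s):
        if ch in seen and seen[ch] >= start:
            start = seen[ch] + 1  # move the window past the repeat
        seen[ch] = i
        best = max(best, i - start + 1)
    return best

print(reverse_string("spark"))               # kraps
print(longest_unique_substring("abcabcbb"))  # 3
```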

Interview experience
4
Good
Difficulty level
Moderate
Process Duration
Less than 2 weeks
Result
Selected

I applied via campus placement at Lady Shri Ram College for Women, Delhi

Round 1 - Aptitude Test 

Basic English, Quants and Statistics

Round 2 - Group Discussion 

Easy, relevant to the pandemic

Round 3 - Technical 

(1 Question)

  • Q1. Python, Tableau, SQL, Stats, ML, all questions easy to medium level
Round 4 - One-on-one 

(2 Questions)

  • Q1. Behavioural Questions
  • Q2. Statistics Case Study

Interview Preparation Tips

Interview preparation tips for other job seekers - Good and organized interview process for the post of Data Analyst

PwC Interview FAQs

How many rounds are there in PwC Big Data Engineer interview?
The PwC interview process usually has 1 round. The most common round in the PwC interview process is a One-on-one round.
How to prepare for PwC Big Data Engineer interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at PwC. The most common topics and skills that interviewers at PwC expect are Python, SQL, Big Data, Spark and Hadoop.
What are the top questions asked in PwC Big Data Engineer interview?

Some of the top questions asked at the PwC Big Data Engineer interview -

  1. If we have streaming data coming from Kafka into Spark, how will you handle fault tolerance?
  2. What are the core components of Spark?
  3. What is Apache Spark?


People are getting interviews through

Job Portal: 100% (based on 1 PwC interview)

Low confidence: the data is based on a small number of responses received from candidates.

PwC Big Data Engineer Salary
based on 23 salaries
₹5.1 L/yr - ₹21 L/yr
13% more than the average Big Data Engineer Salary in India

PwC Big Data Engineer Reviews and Ratings

based on 1 review

3.0/5

Rating in categories

4.0

Skill development

3.0

Work-Life balance

4.0

Salary & Benefits

3.0

Job Security

4.0

Company culture

4.0

Promotions/Appraisal

3.0

Work Satisfaction

Senior Associate
14.5k salaries
₹8 L/yr - ₹30 L/yr

Associate
12.6k salaries
₹4.5 L/yr - ₹16 L/yr

Manager
6.6k salaries
₹13.4 L/yr - ₹50 L/yr

Senior Consultant
4.4k salaries
₹8.9 L/yr - ₹32 L/yr

Associate2
4.1k salaries
₹4.5 L/yr - ₹16.5 L/yr
Compare PwC with

Deloitte (3.8), Ernst & Young (3.5), Accenture (3.9), TCS (3.7)
