Coforge

3.3

based on 4.6k Reviews

Coforge Big Data Engineer Interview Questions and Answers

Updated 30 Oct 2023

Coforge Big Data Engineer Interview Experiences

2 interviews found

Big Data Engineer Interview Questions & Answers

Anonymous

posted on 24 Jul 2023

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Selected

I applied via LinkedIn and was interviewed in Jun 2023. There were 4 interview rounds.

Round 1 - Resume Shortlist

Pro Tip by AmbitionBox:

Keep your resume crisp and to the point. A recruiter looks at your resume for an average of 6 seconds, so make sure to leave the best impression.

Round 2 - Technical

(1 Question)

Q1. Project overview, Spark architecture questions, and Scala coding questions

Round 3 - Technical

(1 Question)

Q1. Project overview, Scala, and AWS services questions

Round 4 - HR

(1 Question)

Q1. Asked about multiple job switches and experience

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare more on Spark internals.

Big Data Engineer Interview Questions & Answers

Harihara Sudan K

posted on 30 Oct 2023

Interview experience

Good

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

Not Selected

I was approached by the company and was interviewed in Sep 2023. There were 3 interview rounds.

Round 1 - Resume Shortlist

Pro Tip by AmbitionBox:

Don’t add your photo or details such as gender, age, and address in your resume. These details do not add any value.

Round 2 - Technical

(1 Question)

Q1. Basics and Optimization techniques in Spark

Ans.

Spark basics include RDDs, transformations, actions, and optimizations like caching and partitioning.

RDDs (Resilient Distributed Datasets) are the fundamental data structure in Spark
Transformations like map, filter, and reduceByKey are used to process data in RDDs
Actions like count, collect, and saveAsTextFile trigger execution of transformations
Optimization techniques include caching frequently accessed data and parti...

Answered by AI

Round 3 - One-on-one

(1 Question)

Q1. Roles and responsibilities in your project

Big Data Engineer Jobs at Coforge

Cloud Big Data Engineer - Python/Hadoop/Hive (5-8 yrs)

5-8 Yrs

₹ 10-19.9 LPA

Interview questions from similar companies

Big Data Engineer Interview Questions & Answers

Virtusa Consulting Services

Anonymous

posted on 14 Dec 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I was interviewed in Nov 2024.

Round 1 - One-on-one

(7 Questions)

Q1. Command to check disk utilisation and health in Hadoop

Ans.

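Use 'hdfs dfsadmin -report' to check disk utilisation and 'hdfs fsck /' to check file system health in Hadoop

Run 'hdfs dfsadmin -report' to see configured capacity, used space, and remaining space for the cluster and for each DataNode
Run 'hdfs dfs -df -h' for a summary of HDFS capacity and free space
Run 'hdfs fsck /' to report missing, corrupt, and under-replicated blocks
Use 'hdfs diskbalancer -plan <datanode>' to generate a plan for balancing data across the disks of a single DataNode
Check the Hadoop logs for any disk health issues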

Answered by AI

Q2. Spark architecture and the significance of each component

Ans.

Spark Architecture consists of Driver, Cluster Manager, and Executors. Driver manages the execution of Spark jobs.

Driver: Manages the execution of Spark jobs, converts user code into tasks, and coordinates with Cluster Manager.
Cluster Manager: Manages resources across the cluster and allocates resources to Spark applications.
Executors: Execute tasks assigned by the Driver and store data in memory or disk for further pr...

Answered by AI

Q3. Partitioning and bucketing

Q4. Spark optimization techniques

Ans.

Optimization techniques in Spark improve performance and efficiency of data processing.

Partitioning data to distribute workload evenly
Caching frequently accessed data in memory
Using broadcast variables for small lookup tables
Avoiding shuffling operations whenever possible
Tuning memory settings and garbage collection parameters

Answered by AI

Q5. Second highest salary

Ans.

This is the classic SQL exercise: return the second highest salary from an employee table.

Select the maximum salary that is lower than the overall maximum, using a subquery.
Alternatively, rank salaries with DENSE_RANK() in descending order and keep the rows with rank 2.
Decide how ties and a missing second value should be handled, for example by returning NULL when only one distinct salary exists.
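
In a Big Data Engineer interview, "second highest salary" is usually meant as a SQL coding exercise. The block below is a minimal sketch of the subquery approach, run against an in-memory SQLite database. The `employees` table, its columns, and the sample rows are hypothetical and only serve the illustration.

```python
import sqlite3

def second_highest_salary(rows):
    """Return the second highest distinct salary, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this illustration.
    conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
    # Take the maximum of all salaries strictly below the overall maximum.
    query = """
        SELECT MAX(salary)
        FROM employees
        WHERE salary < (SELECT MAX(salary) FROM employees)
    """
    result = conn.execute(query).fetchone()[0]
    conn.close()
    return result

sample = [("asha", 90), ("ravi", 120), ("meena", 120), ("john", 75)]
print(second_highest_salary(sample))
```

Because the comparison is strict, two employees sharing the top salary do not hide the second value, and a table with a single distinct salary yields NULL (None in Python).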

Answered by AI

Q6. Creating a pivot table in SQL from an unpivoted table

Ans.

To create a pivot table in SQL from an unpivoted table, you can use CASE expressions inside aggregate functions.

Use one CASE expression per output column to pick the values that belong in that column
Wrap each CASE expression in an aggregate function such as SUM, COUNT, or AVG
Group the data by the columns that should remain as rows
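
The steps above can be sketched with conditional aggregation in an in-memory SQLite database. The `sales` table, its columns, and the two quarters are assumptions made up for this example; some databases also offer a dedicated PIVOT clause, which is not shown here.

```python
import sqlite3

def pivot_sales(rows):
    """Turn (region, quarter, amount) rows into one row per region."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical long-format table: one row per region and quarter.
    conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # One CASE expression per output column, aggregated per region.
    query = """
        SELECT region,
               SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
               SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
        FROM sales
        GROUP BY region
        ORDER BY region
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

long_rows = [
    ("north", "Q1", 10),
    ("north", "Q2", 20),
    ("south", "Q1", 5),
    ("north", "Q1", 7),
]
print(pivot_sales(long_rows))
```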

Answered by AI

Q7. How to create triggers

Ans.

Creating triggers in a database involves defining the trigger, specifying the event that will activate it, and writing the code to be executed.

Define the trigger using the CREATE TRIGGER statement
Specify the event that will activate the trigger (e.g. INSERT, UPDATE, DELETE)
Write the code or actions to be executed when the trigger is activated
Test the trigger to ensure it functions as intended
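
As an illustration of these steps, here is a small sketch using SQLite syntax through Python's sqlite3 module. The `accounts` and `audit_log` tables and the trigger name are hypothetical, and trigger syntax differs between database systems, so treat this as one possible form, not a universal one.

```python
import sqlite3

def run_trigger_demo():
    """Create an AFTER UPDATE trigger and return the audit rows it wrote."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE audit_log (
            account_id INTEGER, old_balance INTEGER, new_balance INTEGER
        );

        -- Define the trigger, name the activating event, and give the action.
        CREATE TRIGGER log_balance_change
        AFTER UPDATE OF balance ON accounts
        FOR EACH ROW
        BEGIN
            INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
        END;
    """)
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    # This UPDATE fires the trigger once for the changed row.
    conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")
    log = conn.execute("SELECT * FROM audit_log").fetchall()
    conn.close()
    return log

print(run_trigger_demo())
```

The final SELECT doubles as the "test the trigger" step: the audit table should contain exactly one row describing the balance change.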

Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - Easy to medium questions were asked.
They focus mainly on concepts.

Big Data Engineer Interview Questions & Answers

LTIMindtree

Anonymous

posted on 21 Oct 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

I applied via Naukri.com and was interviewed in Sep 2024. There were 2 interview rounds.

Round 1 - Coding Test

It was a WeCP-based test.

Round 2 - Technical

(2 Questions)

Q1. Explain Spark Architecture in detail

Ans.

Spark Architecture is a distributed computing framework that provides high-level APIs for in-memory computing.

Spark Architecture consists of a cluster manager, worker nodes, and a driver program.
It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.
Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object.
It supports various data ...

Answered by AI

Q2. Explain higher-order functions, closures, anonymous functions, map, flatMap, and tail recursion

Ans.

Higher-order functions, closures, anonymous functions, map, flatMap, and tail recursion are key concepts in functional programming.

Higher order function: Functions that can take other functions as arguments or return functions as results.
Closure: Functions that capture variables from their lexical scope, even when they are called outside that scope.
Anonymous function: Functions without a specified name, often used as a...
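
The interview asked about these concepts in Scala; the sketch below shows the same ideas in Python so that every example in this page uses one language. All function names are made up for the illustration. One difference worth knowing: Scala's compiler can optimize self-recursive tail calls (and `@tailrec` asks it to verify that), whereas CPython does not eliminate tail calls.

```python
from itertools import chain

def apply_twice(func, value):
    """Higher-order function: takes another function as an argument."""
    return func(func(value))

def make_counter():
    """Closure: `increment` captures `count` from the enclosing scope."""
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

def flat_map(func, items):
    """flatMap: map each item to a sequence, then flatten one level."""
    return list(chain.from_iterable(map(func, items)))

def factorial(n, acc=1):
    """Tail-recursive shape: the recursive call is the last operation."""
    if n <= 1:
        return acc
    return factorial(n - 1, n * acc)

# Anonymous function (lambda), bound to a name only so it can be reused below.
square = lambda x: x * x

counter = make_counter()
print(apply_twice(square, 3))
print(counter(), counter())
print(list(map(square, [1, 2, 3])))
print(flat_map(str.split, ["a b", "c"]))
print(factorial(5))
```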

Answered by AI

Interview Preparation Tips

Topics to prepare for LTIMindtree Big Data Engineer interview:

SCALA
Spark
SQL

Big Data Engineer Interview Questions & Answers

Nagarro

Anonymous

posted on 16 Jan 2025

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

I applied via Referral and was interviewed in Dec 2024. There were 2 interview rounds.

Round 1 - Aptitude Test

30 Questions in 20 Minutes

Round 2 - Technical

(1 Question)

Q1. Basics of SQL, Python, and AWS, plus in-depth Spark questions

Big Data Engineer Interview Questions & Answers

Hexaware Technologies

Anonymous

posted on 11 Jun 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Job Portal and was interviewed in May 2024. There was 1 interview round.

Round 1 - Technical

(1 Question)

Q1. Explain PySpark architecture

Ans.

PySpark is the Python API for Apache Spark; the Python driver process talks to the JVM-based Spark engine through Py4J.

PySpark exposes Spark Core, Spark SQL, Spark Streaming, and MLlib; GraphX has no Python API.
It allows Python developers to interact with Spark using PySpark API.
PySpark architecture enables distributed processing of large datasets using RDDs and DataFrames.
It leverages the power of in-memory processing for fast...

Answered by AI

Big Data Engineer Interview Questions & Answers

NTT Data

Anonymous

posted on 22 Aug 2023

Interview experience

Bad

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Naukri.com and was interviewed in Jul 2023. There were 2 interview rounds.

Round 1 - Resume Shortlist

Round 2 - Technical

(2 Questions)

Q1. Basic Questions of Scala Functional Programming concepts.

Q2. Spark internal working and optimization techniques

Ans.

Spark internal working and optimization techniques

Spark uses Directed Acyclic Graph (DAG) for optimizing workflows
Lazy evaluation lets Spark see the whole plan before running it and pipeline consecutive narrow transformations into a single stage
Caching and persistence of intermediate results can improve performance
Partitioning data can help in parallel processing and reducing shuffle operations

Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - The interview call was abruptly ended about 20 minutes in because the HR had another conflicting call. The HR phoned me and said that if the interview panel requested it, she would let me know and the call could be extended, but she did not call back and the interview was not extended. They rejected me after the panel had spoken with me for only 20 minutes in the first round. This shows how unprofessional they are in scheduling an interview, and I do not see how any panel can decide within 20 minutes of discussion. I definitely do not recommend attending Big Data Engineering interviews here.

Senior Data Engineer Interview Questions & Answers

Publicis Sapient

Anonymous

posted on 15 Jan 2025

Interview experience

Average

Round 1 - Aptitude Test

The aptitude test lasts 30 minutes and focuses on topics relevant to data engineering, including Spark, SQL, Azure, and PySpark.

Round 2 - Coding Test

The coding test is a one-hour examination on PySpark.

Round 3 - Technical

(3 Questions)

Q1. What is the difference between cache() and persist()?

Q2. What is the purpose of the spark-submit command in Apache Spark?

Q3. What are window functions in SQL?

Round 4 - HR

(2 Questions)

Q1. Could you provide more details about the daily responsibilities associated with this role?

Q2. How would you describe your work culture?

Data Analyst Interview Questions & Answers

ITC Infotech

Anonymous

posted on 28 Dec 2024

Interview experience

Excellent

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

Selected

I applied via AmbitionBox and was interviewed in Nov 2024. There were 4 interview rounds.

Round 1 - HR

(2 Questions)

Q1. About yourself

Q2. Communication skills

Round 2 - Technical

(3 Questions)

Q1. Programming language

Q2. What tools do you utilize for data analysis?

Ans.

I utilize tools such as Excel, Python, SQL, and Tableau for data analysis.

Excel for basic data manipulation and visualization
Python for advanced data analysis and machine learning
SQL for querying databases
Tableau for creating interactive visualizations

Answered by AI

Q3. pandas, NumPy, seaborn, and Matplotlib

Round 3 - Coding Test

Data analysis of code in the context of data analysis.

Round 4 - Aptitude Test

Coding logical question paper.

Senior Data Engineer Interview Questions & Answers

Persistent Systems

Anonymous

posted on 17 Jul 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Naukri.com and was interviewed in Aug 2024. There were 2 interview rounds.

Round 1 - Technical

(12 Questions)

Q1. Tell me about yourself and Project

Ans.

I am a Senior Data Engineer with experience in developing data pipelines and optimizing data storage for various projects.

Developed data pipelines using Apache Spark for real-time data processing
Optimized data storage using technologies like Hadoop and AWS S3
Worked on a project to analyze customer behavior and improve marketing strategies

Answered by AI

Q2. What was your day-to-day job in your project?

Ans.

My day-to-day job in the project involved designing and implementing data pipelines, optimizing data workflows, and collaborating with cross-functional teams.

Designing and implementing data pipelines to extract, transform, and load data from various sources
Optimizing data workflows to improve efficiency and performance
Collaborating with cross-functional teams including data scientists, analysts, and business stakeholde...

Answered by AI

Q3. Spark Architecture

Q4. How does the DAG handle fault tolerance?

Ans.

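In Spark, the DAG supports fault tolerance through lineage: lost partitions are recomputed from their parent RDDs, and failed tasks are retried.

The DAG records the lineage of each RDD, so a lost partition can be recomputed from its source data instead of being restored from a replica.
The DAG scheduler tracks stage dependencies, so only the stages needed to rebuild the lost data are rerun.
A failed task is retried a configurable number of times (spark.task.maxFailures, default 4) before the job is marked as failed.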

Answered by AI

Q5. What is shuffling? How to Handle Shuffling?

Ans.

Shuffling is the process of redistributing data across partitions in a distributed computing environment.

Shuffling is necessary when data needs to be grouped or aggregated across different partitions.
It can be handled efficiently by minimizing the amount of data being shuffled and optimizing the partitioning strategy.
Techniques like partitioning, combiners, and reducers can help reduce the amount of shuffling in MapRed

Answered by AI

Q6. What is the difference between repartition and coalesce?

Ans.

Repartition increases or decreases the number of partitions in a DataFrame, while Coalesce only decreases the number of partitions.

Repartition can increase or decrease the number of partitions in a DataFrame, leading to a shuffle of data across the cluster.
Coalesce only decreases the number of partitions in a DataFrame without performing a full shuffle, making it more efficient than repartition.
Repartition is typically...

Answered by AI

Q7. How do you handle Incremental data?

Ans.

Incremental data is handled by identifying new data since the last update and merging it with existing data.

Identify new data since last update
Merge new data with existing data
Update data warehouse or database with incremental changes
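
One common way to implement these steps is a watermark (high-water mark) on a last-modified column. The following is a minimal pure-Python sketch of that idea, not tied to any particular warehouse. The row layout with `id` and `updated_at` fields and the integer timestamps are assumptions made for the example.

```python
def incremental_merge(target, source_rows, last_watermark):
    """Upsert rows changed after last_watermark and return the new watermark.

    target is a dict keyed by record id; source_rows is an iterable of dicts
    with hypothetical 'id' and 'updated_at' fields.
    """
    new_watermark = last_watermark
    for row in source_rows:
        # Step 1: identify rows that changed since the previous load.
        if row["updated_at"] > last_watermark:
            # Step 2: merge (insert or overwrite) into the existing data.
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    # Step 3: the caller stores the watermark for the next run.
    return new_watermark

warehouse = {1: {"id": 1, "value": "a", "updated_at": 10}}
batch = [
    {"id": 1, "value": "a2", "updated_at": 15},  # changed record
    {"id": 2, "value": "b", "updated_at": 12},   # new record
    {"id": 3, "value": "old", "updated_at": 5},  # already processed earlier
]
watermark = incremental_merge(warehouse, batch, last_watermark=10)
print(watermark, sorted(warehouse))
```

Storing the returned watermark between runs is what keeps each load limited to new or changed rows.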

Answered by AI

Q8. What is SCD?

Ans.

SCD stands for Slowly Changing Dimension, a concept in data warehousing to track changes in data over time.

SCD is used to maintain historical data in a data warehouse.
The three most commonly used types of SCD are Type 1, Type 2, and Type 3.
Type 1 SCD overwrites old data with new data.
Type 2 SCD creates a new record for each change, preserving history.
Type 3 SCD maintains both old and new values in the same record.
SCD is important for...
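
Type 2 is the variant interviewers most often ask candidates to implement, so here is a minimal in-memory sketch of it. The row layout (`valid_from`, `valid_to`, `is_current`) is a common convention but is an assumption here, as are the function name and the sample customer data.

```python
def apply_scd2(dimension, key, new_attrs, effective_date):
    """Apply a Type 2 change: close the current row and append a new version.

    dimension is a list of dict rows; the column names are hypothetical.
    """
    current = next(
        (row for row in dimension if row["key"] == key and row["is_current"]),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # no change, so no new version is needed
        # Expire the existing version instead of overwriting it.
        current["valid_to"] = effective_date
        current["is_current"] = False
    dimension.append({
        "key": key,
        "attrs": dict(new_attrs),
        "valid_from": effective_date,
        "valid_to": None,
        "is_current": True,
    })
    return dimension

customers = []
apply_scd2(customers, 1, {"city": "Pune"}, "2024-01-01")
apply_scd2(customers, 1, {"city": "Noida"}, "2024-06-01")
for row in customers:
    print(row)
```

A Type 1 change would instead overwrite `attrs` on the current row, which is why Type 1 keeps no history.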

Answered by AI

Q9. Scenario-based questions related to Spark

Q10. Two SQL coding questions and two Python coding questions, such as reversing a string

Ans.

Reverse a string using SQL and Python codes.

In SQL, use the REVERSE function where the dialect provides it (for example MySQL, SQL Server, and PostgreSQL).
In Python, use slicing with a step of -1 to reverse a string.
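
The Python approach can be shown in a couple of lines. The function names below are made up for the illustration; the second variant is included because interviewers sometimes ask for a version without slicing.

```python
def reverse_string(text):
    """Reverse a string with a slice that steps backwards."""
    return text[::-1]

def reverse_string_manual(text):
    """Same result without slicing: iterate from the end and join."""
    return "".join(reversed(text))

print(reverse_string("spark"))
print(reverse_string_manual("spark"))
```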

Answered by AI

Q11. Find top 5 countries with highest population in Spark and SQL

Ans.

Use Spark and SQL to find the top 5 countries with the highest population.

Use Spark to load the data and perform data processing.
Use SQL queries to group by country and sum the population.
Order the results in descending order and limit to top 5.
Example: SELECT country, SUM(population) AS total_population FROM table_name GROUP BY country ORDER BY total_population DESC LIMIT 5
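
The example query above can be tried without a cluster by running it in an in-memory SQLite database; the same SQL text would typically also work through Spark SQL once the data is registered as a table or view. The `cities` table and the numbers below are invented for the illustration and are not real population figures.

```python
import sqlite3

def top_countries(rows, limit=5):
    """Return the `limit` countries with the largest summed population."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table with one row per city.
    conn.execute("CREATE TABLE cities (country TEXT, city TEXT, population INTEGER)")
    conn.executemany("INSERT INTO cities VALUES (?, ?, ?)", rows)
    query = """
        SELECT country, SUM(population) AS total_population
        FROM cities
        GROUP BY country
        ORDER BY total_population DESC
        LIMIT ?
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result

# Made-up sample values, chosen only so the ordering is easy to follow.
city_rows = [
    ("A", "a1", 20), ("A", "a2", 30),
    ("B", "b1", 25), ("B", "b2", 21),
    ("C", "c1", 8), ("D", "d1", 12),
    ("E", "e1", 14), ("F", "f1", 15),
    ("G", "g1", 10),
]
print(top_countries(city_rows))
```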

Answered by AI

Q12. Using two tables find the different records for different joins

Ans.

Compare two tables with different join types to find matching and non-matching records.

Identify the key columns in both tables to join on
Use INNER JOIN for records present in both tables, and LEFT JOIN, RIGHT JOIN, or FULL JOIN to also keep unmatched records from one or both sides
To isolate records that exist in only one table, use an outer join and filter with WHERE on the other table's key being NULL
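
A short sketch makes the difference between the join types concrete. It uses an in-memory SQLite database with two hypothetical single-column tables, `a` and `b`. Only INNER JOIN and LEFT JOIN are used, because older SQLite versions lack RIGHT and FULL joins; swapping the table order in a LEFT JOIN gives the right-side view.

```python
import sqlite3

def compare_tables(left_ids, right_ids):
    """Return the rows produced by several joins between tables a and b."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER)")
    conn.execute("CREATE TABLE b (id INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in left_ids])
    conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in right_ids])
    queries = {
        # Records present in both tables.
        "inner": "SELECT a.id FROM a INNER JOIN b ON a.id = b.id ORDER BY a.id",
        # All records from a, with NULL where b has no match.
        "left": "SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id ORDER BY a.id",
        # Anti-join: records that exist only in a.
        "only_in_a": "SELECT a.id FROM a LEFT JOIN b ON a.id = b.id "
                     "WHERE b.id IS NULL ORDER BY a.id",
        # Same pattern with the tables swapped: records only in b.
        "only_in_b": "SELECT b.id FROM b LEFT JOIN a ON a.id = b.id "
                     "WHERE a.id IS NULL ORDER BY b.id",
    }
    results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
    conn.close()
    return results

join_results = compare_tables([1, 2, 3], [2, 3, 4])
for name, rows in join_results.items():
    print(name, rows)
```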

Answered by AI

Round 2 - One-on-one

(7 Questions)

Q1. What is the Catalyst optimizer? How does it work?

Ans.

The Catalyst optimizer is the query optimization framework in Spark SQL; it turns a DataFrame or SQL query into an efficient physical execution plan.

Catalyst applies rule-based optimizations and, when table statistics are available, cost-based optimizations.
It moves a query through analysis, logical optimization, physical planning, and code generation.
Typical optimizations include predicate pushdown, constant folding, and join reordering.
By o...

Answered by AI

Q2. Tell me about the optimization you used in your project.

Ans.

Used query optimization techniques to improve performance in database queries.

Utilized indexing to speed up search queries.
Implemented query caching to reduce redundant database calls.
Optimized SQL queries by restructuring joins and subqueries.
Utilized database partitioning to improve query performance.
Used query profiling tools to identify and optimize slow queries.

Answered by AI

Q3. PySpark question related to merging two schemas

Q4. What is the best approach to finding whether the data frame is empty or not?

Ans.

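In PySpark, check whether at least one row exists instead of counting every row; len() works only on pandas DataFrames.

Use df.isEmpty() (available from PySpark 3.3) or len(df.head(1)) == 0, both of which need to fetch at most one row.
Avoid df.count() == 0 on large data, because it counts every row.
For a pandas DataFrame, use df.empty or len(df) == 0.
Example: if df.isEmpty(): print('Data frame is empty')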

Answered by AI

Q5. Spark Architecture

Q6. How do you decide on cores and worker nodes?

Ans.

Cores and worker nodes are decided based on the workload requirements and scalability needs of the data processing system.

Consider the size and complexity of the data being processed
Evaluate the processing speed and memory requirements of the tasks
Take into account the parallelism and concurrency needed for efficient data processing
Monitor the system performance and adjust cores and worker nodes as needed
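
To make the considerations above concrete, here is a rough sizing calculation. It follows a widely quoted rule of thumb that is not stated in the answer above: reserve one core and 1 GB per node for the operating system, use about five cores per executor, reserve one executor slot for the driver or application master, and keep roughly 10% of executor memory for overhead. All of these numbers are assumptions to be tuned against real workloads.

```python
def size_executors(nodes, cores_per_node, mem_gb_per_node,
                   cores_per_executor=5, overhead_fraction=0.10):
    """Estimate executor count and memory from cluster size (heuristic only)."""
    usable_cores = cores_per_node - 1      # assumption: 1 core for OS daemons
    usable_mem_gb = mem_gb_per_node - 1    # assumption: 1 GB for the OS
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for one executor")
    # Assumption: one executor slot is left free for the driver or AM.
    total_executors = executors_per_node * nodes - 1
    mem_per_executor_gb = usable_mem_gb / executors_per_node
    # Leave a share of each executor's memory for off-heap overhead.
    heap_gb = int(mem_per_executor_gb * (1 - overhead_fraction))
    return {
        "executors": total_executors,
        "cores_per_executor": cores_per_executor,
        "executor_memory_gb": heap_gb,
    }

# Hypothetical cluster: 10 worker nodes with 16 cores and 64 GB each.
print(size_executors(nodes=10, cores_per_node=16, mem_gb_per_node=64))
```

The result is only a starting point; monitoring real jobs, as the last bullet above says, decides the final values.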

Answered by AI

Q7. What happens when we enforce a schema?

Ans.

Enforcing schema ensures that data conforms to a predefined structure and rules.

Ensures data integrity by validating incoming data against predefined schema
Helps in maintaining consistency and accuracy of data
Prevents data corruption and errors in data processing
Can lead to rejection of data that does not adhere to the schema

Answered by AI

Interview Preparation Tips

Topics to prepare for Persistent Systems Senior Data Engineer interview:

SQL
Pyspark
Python
Spark
Database

Interview preparation tips for other job seekers - Be prepared with Spark core concepts and SQL Coding

Coforge Interview FAQs

How many rounds are there in Coforge Big Data Engineer interview?

Coforge interview process usually has 3-4 rounds. The most common rounds in the Coforge interview process are Technical, Resume Shortlist and HR.

How to prepare for Coforge Big Data Engineer interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Coforge. The most common topics and skills that interviewers at Coforge expect are Big Data, Hive, Spark, Clinical Data Management and Cloud Computing.

What are the top questions asked in Coforge Big Data Engineer interview?

Some of the top questions asked at the Coforge Big Data Engineer interview:

Basics and optimization techniques in Spark
Project overview, Spark architecture questions, and Scala coding questions
Project overview, Scala, and AWS services questions

Coforge Big Data Engineer Interview Process

based on 2 interviews

Interview experience

Good

TCS Big Data Engineer Interview Questions

3.7

• 7 Interviews

Infosys Big Data Engineer Interview Questions

3.7

• 5 Interviews

Wipro Big Data Engineer Interview Questions

3.7

• 3 Interviews

Virtusa Consulting Services Big Data Engineer Interview Questions

3.8

• 2 Interviews

HCLTech Big Data Engineer Interview Questions

3.5

• 1 Interview

Tech Mahindra Big Data Engineer Interview Questions

3.5

• 1 Interview

LTIMindtree Big Data Engineer Interview Questions

3.8

• 1 Interview

Hexaware Technologies Big Data Engineer Interview Questions

3.6

• 1 Interview

Nagarro Big Data Engineer Interview Questions

4.0

• 1 Interview

KIIT University, Bhuvaneshwar Placement Questions

2 Interviews

IIMT College of Engineering, Noida Placement Questions

1 Interview

IPS Academy, Indore Placement Questions

1 Interview

J S S Academy of Technical Education, Bangalore Placement Questions

1 Interview

Krishna Institute of Engineering and Technology, Ghaziabad Placement Questions

1 Interview

Manav Rachna College of Engineering, Faridabad Placement Questions

1 Interview

Meerut Institute of Engineering and Technology, Meerut Placement Questions

1 Interview

Coforge Big Data Engineer Salary

based on 11 salaries

₹6.9 L/yr - ₹22 L/yr

49% more than the average Big Data Engineer Salary in India

Coforge Salaries in India

Senior Software Engineer 4.9k salaries	₹6.3 L/yr - ₹26 L/yr
Technical Analyst 2.6k salaries	₹9.4 L/yr - ₹38.4 L/yr
Software Engineer 2k salaries	₹2.2 L/yr - ₹9.5 L/yr
Senior Test Engineer 1.8k salaries	₹4.7 L/yr - ₹19.3 L/yr
Technology Specialist 1.2k salaries	₹11.8 L/yr - ₹42 L/yr