Sigmoid

3.4

based on 148 Reviews

Sigmoid Senior Data Engineer Interview Questions and Answers

Updated 20 Feb 2025

10 Interview questions

A Senior Data Engineer was asked 5mo ago

Q. Given a non-decreasing array, how can I determine the indices of an element X within it? If the element is not present, the output should be [-1, -1]. For example, for the array [1,2,3,3,5,5,7,8] and X=5, t...

Ans.

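Find the indices of an element in a non-decreasing array.

A linear scan works: iterate through the array and record the positions where X is found. Because the array is sorted, binary search can locate the first and last occurrence in O(log n) time instead.
Return the indices where X occurs (the [-1, -1] sentinel suggests the first and last position), or [-1, -1] if X is not found.
Handle edge cases such as an empty array or X not being present in the array.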
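
Since the array is non-decreasing, the two boundary positions can be found with two binary searches rather than a full scan. The sketch below uses Python's standard `bisect` module. The function name `find_range` is hypothetical, and it assumes the expected output is the pair [first index, last index], which the truncated question suggests but does not show in full.

```python
from bisect import bisect_left, bisect_right


def find_range(arr, x):
    """Return [first, last] index of x in a non-decreasing list, or [-1, -1]."""
    lo = bisect_left(arr, x)        # leftmost position where x could be inserted
    if lo == len(arr) or arr[lo] != x:
        return [-1, -1]             # x is absent (this also covers an empty list)
    hi = bisect_right(arr, x) - 1   # index of the last element equal to x
    return [lo, hi]


print(find_range([1, 2, 3, 3, 5, 5, 7, 8], 5))  # [4, 5]
```

Both searches run in O(log n), which is usually the point an interviewer is probing for when the input is stated to be sorted.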

A Senior Data Engineer was asked 8mo ago

Q. How does schema inference work in PySpark when reading a file?

Ans.

`inferSchema` is a read option in PySpark that infers column types from the data when reading a file.

`inferSchema` can be set to `True` when reading a file such as a CSV so that Spark derives each column's type from the data; without it, CSV columns are read as strings.
It is useful when the schema of the file is not known beforehand, but it requires an extra pass over the data, so an explicit schema is usually faster for large files.
Example: `df = spark.read.csv('file.csv', header=True, inferSchema=True)`

A Senior Data Engineer was asked 8mo ago

Q. What is SCD in data warehousing?

Ans.

SCD stands for Slowly Changing Dimension in Data Warehousing.

SCD is a technique used in data warehousing to track changes to dimension data over time.
There are different types of SCDs - Type 1, Type 2, and Type 3.
Type 1 SCD overwrites old data with new data, Type 2 creates new records for changes, and Type 3 maintains both old and new values in separate columns.
Example: In a customer dimension table, if a customer...
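
The difference between Type 1 and Type 2 can be sketched in plain Python on a list of dictionaries standing in for a dimension table. Everything here is illustrative: the column names (`customer_id`, `city`, `valid_from`, `valid_to`, `is_current`), the function names, and the sample values are assumptions, not part of the original answer.

```python
from datetime import date


def apply_type1(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return rows


def apply_type2(rows, customer_id, new_city, change_date):
    """Type 2: close the current row and append a new current row."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["valid_to"] = change_date
            row["is_current"] = False
    rows.append({
        "customer_id": customer_id, "city": new_city,
        "valid_from": change_date, "valid_to": None, "is_current": True,
    })
    return rows


dim = [{"customer_id": 1, "city": "Pune",
        "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]
apply_type2(dim, 1, "Mumbai", date(2024, 6, 1))
print(len(dim))  # 2: the closed historical row plus the new current row
```

Type 1 would leave the table at one row with the city overwritten, which is why it cannot answer questions about past values.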

A Senior Data Engineer was asked 8mo ago

Q. What are the differences between repartition and coalesce?

Ans.

Repartition can increase or decrease the number of partitions in a DataFrame and always performs a full shuffle, while coalesce only decreases the number of partitions and avoids a full shuffle.

Repartition involves shuffling data across the network, which can be expensive in terms of performance and resources, but it produces roughly evenly sized partitions.
Coalesce is a more efficient operation as it minimizes data movement by only merging existing partitions, which can leave partitions unevenly sized.
Repartition is typically used when there is a need for more pa...

A Senior Data Engineer was asked 8mo ago

Q. What is the difference between RANK and DENSE_RANK?

Ans.

RANK and DENSE_RANK both give tied rows the same rank; RANK leaves gaps after ties, while DENSE_RANK does not.

RANK assigns the same rank to rows with equal values in the ordering and then skips the following ranks, e.g., 1, 2, 2, 4.
DENSE_RANK assigns the same rank to tied rows without leaving gaps between ranks, e.g., 1, 2, 2, 3.
Neither function guarantees unique ranks; ROW_NUMBER is the function that numbers every row uniquely.
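
The gap behaviour is easy to see on a small list with one tie. The following is a minimal pure-Python sketch of the two numbering schemes, assuming a descending ordering (as in a hypothetical `ORDER BY salary DESC`); the function names and sample salaries are made up for illustration.

```python
def rank(values):
    """RANK-style numbering: ties share a rank and a gap follows the tie."""
    ordered = sorted(values, reverse=True)
    # A value's rank is 1 + the number of rows that sort strictly before it,
    # which equals the index of its first occurrence in the sorted list.
    return [ordered.index(v) + 1 for v in ordered]


def dense_rank(values):
    """DENSE_RANK-style numbering: ties share a rank and no ranks are skipped."""
    ordered = sorted(values, reverse=True)
    distinct = sorted(set(values), reverse=True)
    position = {v: i + 1 for i, v in enumerate(distinct)}
    return [position[v] for v in ordered]


salaries = [300, 200, 200, 100]
print(rank(salaries))        # [1, 2, 2, 4]
print(dense_rank(salaries))  # [1, 2, 2, 3]
```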

A Senior Data Engineer was asked 8mo ago

Q. How do you read a Parquet file and remove duplicates?

Ans.

Read a parquet file using PySpark and remove duplicates based on specified columns.

Use PySpark to read the parquet file: `df = spark.read.parquet('path/to/file.parquet')`.
Identify the columns to check for duplicates, e.g., `['column1', 'column2']`.
Use the `dropDuplicates()` method: `df_unique = df.dropDuplicates(['column1', 'column2'])`.
Write the cleaned DataFrame back to a parquet file: `df_unique.write.parquet('...

A Senior Data Engineer was asked 8mo ago

Q. Given some numbers, how many rows would result from an inner join, left join, right join, and full join?

Ans.

The number of rows each SQL join returns depends on how many key values match and how often each key repeats.

Inner Join: Returns only rows with matching values in both tables. Example: 10 rows in Table A and 15 in Table B yield 5 rows if exactly 5 one-to-one matches exist.
Left Join: Returns all rows from the left table plus matched rows from the right. Example: 10 rows in A with 5 one-to-one matches in B yield 10 rows; duplicate keys in B add extra rows.
Right Join: Returns all rows from the right tab...
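
When the interviewer supplies two columns of key values, the counts can be worked out by multiplying how often each key appears on each side. The sketch below does this in plain Python for a single join key; the helper name `join_row_counts` and the sample keys are hypothetical, and it assumes there are no NULL keys (in SQL, NULL never matches NULL).

```python
from collections import Counter


def join_row_counts(left_keys, right_keys):
    """Row counts for equi-joins on one key column (assumes no NULL keys)."""
    left, right = Counter(left_keys), Counter(right_keys)
    # Each shared key contributes (copies on the left) x (copies on the right) rows.
    inner = sum(left[k] * right[k] for k in left if k in right)
    left_only = sum(n for k, n in left.items() if k not in right)
    right_only = sum(n for k, n in right.items() if k not in left)
    return {
        "inner": inner,
        "left": inner + left_only,
        "right": inner + right_only,
        "full": inner + left_only + right_only,
    }


print(join_row_counts([1, 1, 2, 3], [1, 1, 2, 4]))
# {'inner': 5, 'left': 6, 'right': 6, 'full': 7}
```

Here key 1 appears twice on each side and so produces 2 x 2 = 4 rows on its own, which is the detail such questions usually test.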

A Senior Data Engineer was asked 8mo ago

Q. What are the optimization techniques in Spark?

Ans.

Optimizing techniques in Spark involve partitioning, caching, and tuning resources for efficient data processing.

Use partitioning to distribute data evenly across nodes for parallel processing
Cache frequently accessed data in memory to avoid recomputation
Tune resources such as memory allocation and parallelism settings for optimal performance

A Senior Data Engineer was asked 8mo ago

Q. What is normalization in databases, and what are its types?

Ans.

Normalization in databases is the process of organizing data in a database to reduce redundancy and improve data integrity.

Normalization is used to eliminate redundant data and ensure data integrity.
It involves breaking down a table into smaller tables and defining relationships between them.
There are different normal forms such as 1NF, 2NF, 3NF, and BCNF.
Normalization helps in reducing data redundancy and improvi...

A Senior Data Engineer was asked 8mo ago

Q. What is the difference between a transformation and an action?

Ans.

A transformation lazily defines a new dataset from an existing one, while an action triggers the computation and returns or writes a result.

Transformations are lazy: they are recorded in the execution plan without running any computation.
Actions trigger execution of the recorded transformations and return a result to the driver or write it to storage.
Examples of transformations in Spark include map, filter, and groupByKey.
Examples of actions in Spark include count, collect, reduce, and saveAsTextFile.
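
Lazy evaluation can be illustrated without a cluster. The snippet below is only a loose analogy in plain Python, not Spark code: the built-in `map` and `filter` return lazy iterators (standing in for transformations), and `list(...)` plays the role of an action that forces the pipeline to run. The `log` list and `double` function are made up for the demonstration.

```python
log = []


def double(x):
    log.append(x)  # record when the function actually runs
    return x * 2


numbers = [1, 2, 3, 4]

# "Transformations": map and filter only build a lazy pipeline; nothing runs yet.
pipeline = filter(lambda v: v > 4, map(double, numbers))
calls_before = len(log)

# "Action": consuming the iterator forces the whole pipeline to execute.
result = list(pipeline)
calls_after = len(log)

print(calls_before, calls_after, result)  # 0 4 [6, 8]
```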

Sigmoid Senior Data Engineer Interview Experiences

3 interviews found

Senior Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Jan 2025

Interview experience

Excellent

Round 1 - Technical

(2 Questions)

Q1. Given a non-decreasing array, how can I determine the indices of an element X within it? If the element is not present, the output should be [-1, -1]. For example, for the array [1,2,3,3,5,5,7,8] and X=5, ...

Q2. From the employee table, how can I return the employee_name, department, age, and salary of the employee who is older than 35 and has the highest salary within each department?
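
One possible approach, sketched with Python's built-in `sqlite3` module so it can be run as-is. The table layout, column types, and sample rows are assumptions, and the query reads the question as "among employees older than 35, return the highest-paid one in each department"; a different reading would need a different filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(employee_name TEXT, department TEXT, age INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Asha", "Data", 40, 90),
        ("Ravi", "Data", 30, 120),   # highest paid, but not older than 35
        ("Meera", "Data", 45, 80),
        ("John", "Sales", 50, 70),
        ("Sara", "Sales", 36, 75),
    ],
)

# Find the top salary per department among employees older than 35,
# then join back to pick up the matching employee rows.
query = """
SELECT e.employee_name, e.department, e.age, e.salary
FROM employee AS e
JOIN (
    SELECT department, MAX(salary) AS max_salary
    FROM employee
    WHERE age > 35
    GROUP BY department
) AS m
  ON e.department = m.department AND e.salary = m.max_salary
WHERE e.age > 35
ORDER BY e.department
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('Asha', 'Data', 40, 90), ('Sara', 'Sales', 36, 75)]
```

If two eligible employees tie for the top salary in a department, this query returns both; a window function such as ROW_NUMBER could be used instead when exactly one row per department is required.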

Senior Data Engineer Interview Questions & Answers

Anonymous

posted on 10 Oct 2024

Interview experience

Bad

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical

(10 Questions)

Q1. What is SCD in data warehousing?

Q2. How does schema inference (inferSchema) work in PySpark when reading a file?

Q3. Triplet program

Q4. What is the difference between RANK and DENSE_RANK?

Q5. Given some numbers, how many rows would result from an inner join, left join, right join, and full join?

Q6. How do you read a Parquet file and remove duplicates?

Q7. Optimizing techniques in spark

Q8. What are the differences between repartition and coalesce?

Q9. Normalization in db and types

Q10. Transformation vs action

Senior Data Engineer Interview Questions & Answers

Abhishek Kumar Pandey

posted on 20 Feb 2025

Interview experience

Poor

Round 1 - Coding Test

Share price question: find the maximum profit.

Interview questions from similar companies

Software Engineer Interview Questions & Answers

Crisil

Anonymous

posted on 3 Jul 2022

I was approached by the company and was interviewed before Jul 2021. There was 1 interview round.

Round 1 - Technical

(1 Question)

Q1. How would you write a REST API from scratch? Explain your role in the project.

Ans.

To write a REST API from scratch, I would follow these steps:

Define the resources and endpoints
Choose a programming language and framework
Implement CRUD operations for each resource
Use HTTP methods and status codes correctly
Add authentication and authorization
Test the API using tools like Postman
Document the API using tools like Swagger

Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - There are two technical rounds and one HR round.

Software Engineer Interview Questions & Answers

Indegene

Anonymous

posted on 26 Apr 2021

Interview Questionnaire

2 Questions

Q1. What do you know about our company?

Q2. Are you interested in higher education?

Ans.

I value higher education for its role in personal growth and professional development in the tech industry.

Higher education provides in-depth knowledge in specialized areas, such as machine learning or cybersecurity.
It offers networking opportunities with peers and industry professionals, which can lead to collaborations and job opportunities.
Pursuing advanced degrees, like a Master's in Computer Science, can enhance c...

Answered by AI

Software Engineer Interview Questions & Answers

Indegene

MSA msa

posted on 27 Jun 2021

Interview Questionnaire

2 Questions

Q1. This is swap case

Q2. str.swapcase()

Ans.

str.swapcase() returns a new string with all uppercase letters converted to lowercase and vice versa.

Usage: 'Hello World'.swapcase() returns 'hELLO wORLD'.
It affects only alphabetic characters; numbers and symbols remain unchanged.
This method does not modify the original string; it returns a new one.
Example: 'Python 3.8'.swapcase() results in 'pYTHON 3.8'.

Answered by AI

Software Engineer Interview Questions & Answers

Mu Sigma

Anonymous

posted on 11 Nov 2024

Interview experience

Excellent

Round 1 - Technical

(1 Question)

Q1. What is FIFO? What is the difference between FIFO and LIFO?

Ans.

FIFO stands for First In, First Out. LIFO stands for Last In, First Out.

FIFO is a method for organizing and manipulating a data buffer, where the first element added is the first to be removed.
LIFO is a method where the last element added is the first to be removed.
FIFO is like a queue, while LIFO is like a stack.
Example: In a FIFO queue, if elements A, B, and C are added in that order, they will be removed in the same...
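
The two removal orders can be shown side by side with standard-library containers. This is a small sketch using `collections.deque` as the queue and a plain list as the stack; the variable names are illustrative only.

```python
from collections import deque

items = ["A", "B", "C"]

# FIFO: a queue removes from the front, so elements leave in arrival order.
queue = deque(items)
fifo_order = [queue.popleft() for _ in range(len(items))]

# LIFO: a stack removes from the end, so the newest element leaves first.
stack = list(items)
lifo_order = [stack.pop() for _ in range(len(items))]

print(fifo_order)  # ['A', 'B', 'C']
print(lifo_order)  # ['C', 'B', 'A']
```

`deque.popleft()` is used for the queue because removing from the front of a plain list with `pop(0)` takes time proportional to the list's length.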

Answered by AI

Software Engineer Interview Questions & Answers

Kantar

Ganesh Rohit

posted on 18 Jan 2025

Interview experience

Good

Round 1 - Coding Test

The first round was an online coding round, the second was live coding in front of a panel, and the third covered DSA and basic database questions.

Software Engineer Interview Questions & Answers

Kantar

K VENKATA SAI DEEPAK

posted on 14 May 2024

Interview experience

Good

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

Selected

I appeared for an interview in Apr 2024.

Round 1 - One-on-one

(2 Questions)

Q1. Basics of SQL and some coding questions

Q2. Data structures

Round 2 - Technical

(1 Question)

Q1. Putin arrays strings

Software Engineer Interview Questions & Answers

Nielsen

Anonymous

posted on 17 Nov 2021

I applied via LinkedIn and was interviewed in Oct 2021. There were 4 interview rounds.

Interview Questionnaire

2 Questions

Q1. Basic to advanced JavaScript concepts, OOPS and data structures

Q2. Writing live code for the problems given during the interview rounds

Interview Preparation Tips

Interview preparation tips for other job seekers - Basic to advanced knowledge is a must; also practice some easy and medium level LeetCode problems.

Sigmoid Interview FAQs

How many rounds are there in Sigmoid Senior Data Engineer interview?

The Sigmoid interview process usually has 1 round. The most common rounds in the Sigmoid interview process are Technical and Coding Test.

How to prepare for Sigmoid Senior Data Engineer interview?

Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Sigmoid. The most common topics and skills that interviewers at Sigmoid expect are Communication Skills, Data Modeling, ETL Testing, IELTS and IT Infrastructure.

What are the top questions asked in Sigmoid Senior Data Engineer interview?

Some of the top questions asked at the Sigmoid Senior Data Engineer interview:

Given a non-decreasing array, how can I determine the indices of an element X w...
Some numbers gave how many rows for inner join, left, right, full j...
inferSchema in PySpark when reading f...

2.7/5

based on 3 interview experiences

Difficulty level

Moderate 100%

Duration

Less than 2 weeks 100%

Mu Sigma Interview Questions

2.6

• 240 Interviews

IKS Health Interview Questions

3.6

• 240 Interviews

Crisil Interview Questions

3.6

• 202 Interviews

CorroHealth infotech Interview Questions

3.2

• 162 Interviews

Acuity Knowledge Partners Interview Questions

3.3

• 159 Interviews

Indegene Interview Questions

3.4

• 158 Interviews

Nielsen Interview Questions

3.7

• 133 Interviews

Kantar Interview Questions

3.5

• 106 Interviews

Baroda Global Shared Services Interview Questions

4.0

• 104 Interviews

Netscribes Interview Questions

2.7

• 89 Interviews

Sigmoid Senior Data Engineer Salary

based on 13 salaries

₹18 L/yr - ₹27.5 L/yr

At par with the average Senior Data Engineer Salary in India

Sigmoid Salaries in India

Software Development Engineer II 112 salaries	₹14 L/yr - ₹23 L/yr
Data Engineer 102 salaries	₹6.5 L/yr - ₹25 L/yr
Senior Data Scientist 68 salaries	₹17 L/yr - ₹28.5 L/yr
Data Scientist 63 salaries	₹8 L/yr - ₹24 L/yr
Senior Data Analyst 52 salaries	₹13.2 L/yr - ₹25 L/yr