ValueLabs

3.7

based on 1.7k Reviews

ValueLabs Associate Data Engineer Interview Questions and Answers

Updated 15 May 2021

ValueLabs Associate Data Engineer Interview Experiences

1 interview found

Associate Data Engineer Interview Questions & Answers

Anonymous

posted on 15 May 2021

I applied via Naukri.com and was interviewed in Apr 2021. There were 3 interview rounds.

Interview Questionnaire

1 Question

Q1. Azure Data Factory, Databricks, PySpark, data warehouse, Synapse

Interview Preparation Tips

Interview preparation tips for other job seekers - Good with four rounds

Interview questions from similar companies

Data Analyst Interview Questions & Answers

ITC Infotech

Anonymous

posted on 28 Dec 2024

Interview experience

Excellent

Difficulty level

Easy

Process Duration

Less than 2 weeks

Result

Selected

I applied via AmbitionBox and was interviewed in Nov 2024. There were 4 interview rounds.

Round 1 - HR

(2 Questions)

Q1. Tell me about yourself

Q2. Communication skills

Round 2 - Technical

(3 Questions)

Q1. Programming language

Q2. What tools do you utilize for data analysis?

Ans.

I utilize tools such as Excel, Python, SQL, and Tableau for data analysis.

Excel for basic data manipulation and visualization
Python for advanced data analysis and machine learning
SQL for querying databases
Tableau for creating interactive visualizations

Answered by AI

Q3. pandas, NumPy, seaborn, Matplotlib

Round 3 - Coding Test

Data analysis of code in the context of data analysis.

Round 4 - Aptitude Test

Coding logical question paper.

Senior Data Engineer Interview Questions & Answers

Persistent Systems

Anonymous

posted on 17 Jul 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Naukri.com and was interviewed in Aug 2024. There were 2 interview rounds.

Round 1 - Technical

(12 Questions)

Q1. Tell me about yourself and Project

Ans.

I am a Senior Data Engineer with experience in developing data pipelines and optimizing data storage for various projects.

Developed data pipelines using Apache Spark for real-time data processing
Optimized data storage using technologies like Hadoop and AWS S3
Worked on a project to analyze customer behavior and improve marketing strategies

Answered by AI

Q2. What was your day-to-day job in your project?

Ans.

My day-to-day job in the project involved designing and implementing data pipelines, optimizing data workflows, and collaborating with cross-functional teams.

Designing and implementing data pipelines to extract, transform, and load data from various sources
Optimizing data workflows to improve efficiency and performance
Collaborating with cross-functional teams including data scientists, analysts, and business stakeholde...

Answered by AI

Q3. Spark Architecture

Q4. How does the DAG handle fault tolerance?

Ans.

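In Spark, the DAG records the lineage of every RDD, so failed tasks can be retried and lost partitions can be recomputed from their parent data.

Failed tasks are rerun automatically, up to a configurable number of attempts, before the job is marked as failed.
The DAG keeps the dependencies between stages, so only the stages needed to rebuild the lost data are rerun, in the correct order.
Because the transformations are recorded in the lineage, a lost partition can be rebuilt without replicating intermediate results.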

Answered by AI

Q5. What is shuffling? How to Handle Shuffling?

Ans.

Shuffling is the process of redistributing data across partitions in a distributed computing environment.

Shuffling is necessary when data needs to be grouped or aggregated across different partitions.
It can be handled efficiently by minimizing the amount of data being shuffled and optimizing the partitioning strategy.
Techniques like partitioning, combiners, and reducers can help reduce the amount of shuffling in MapRed

Answered by AI

Q6. What is the difference between repartition and coalesce?

Ans.

Repartition increases or decreases the number of partitions in a DataFrame, while Coalesce only decreases the number of partitions.

Repartition can increase or decrease the number of partitions in a DataFrame, leading to a shuffle of data across the cluster.
Coalesce only decreases the number of partitions in a DataFrame without performing a full shuffle, making it more efficient than repartition.
Repartition is typically...

Answered by AI

Q7. How do you handle Incremental data?

Ans.

Incremental data is handled by identifying new data since the last update and merging it with existing data.

Identify new data since last update
Merge new data with existing data
Update data warehouse or database with incremental changes

Answered by AI
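
The steps above can be sketched as a watermark-based load. This is only a minimal illustration: it assumes each source row carries a unique `id` and an `updated_at` timestamp, and the function and field names are hypothetical. A real pipeline would usually persist the watermark between runs and rely on the warehouse's own merge or upsert statement.

```python
from datetime import datetime


def incremental_merge(target, source_rows, last_watermark):
    """Upsert rows changed after last_watermark into target (a dict keyed by id).

    Returns the new watermark to store for the next run.
    """
    new_watermark = last_watermark
    for row in source_rows:
        # Step 1: pick up only rows that are new or changed since the last load.
        if row["updated_at"] > last_watermark:
            # Step 2: merge, inserting new ids and overwriting existing ones.
            target[row["id"]] = row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical sample data.
target = {1: {"id": 1, "name": "alice", "updated_at": datetime(2024, 1, 1)}}
source = [
    {"id": 1, "name": "alice", "updated_at": datetime(2024, 1, 1)},   # unchanged
    {"id": 1, "name": "alicia", "updated_at": datetime(2024, 1, 5)},  # update
    {"id": 2, "name": "bob", "updated_at": datetime(2024, 1, 3)},     # insert
]
watermark = incremental_merge(target, source, datetime(2024, 1, 1))
```

Comparing against a stored watermark keeps each run proportional to the changed rows rather than to the full table.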

Q8. What is SCD?

Ans.

SCD stands for Slowly Changing Dimension, a concept in data warehousing to track changes in data over time.

SCD is used to maintain historical data in a data warehouse.
The three most commonly used types of SCD are Type 1, Type 2, and Type 3.
Type 1 SCD overwrites old data with new data.
Type 2 SCD creates a new record for each change, preserving history.
Type 3 SCD maintains both old and new values in the same record.
SCD is important for...

Answered by AI
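
As a rough illustration of the Type 2 behaviour described above, the sketch below keeps a history of versions per key instead of overwriting. The row layout (`valid_from`, `valid_to`, `is_current`) and the function name are assumptions chosen for the example, not a standard schema.

```python
from datetime import date


def apply_scd2(history, key, new_attrs, effective):
    """Apply one change to an SCD Type 2 history (a list of dict rows)."""
    current = next(
        (r for r in history if r["key"] == key and r["is_current"]), None
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return  # nothing changed, keep the current version
        # Close the old version instead of overwriting it (Type 1 would overwrite).
        current["is_current"] = False
        current["valid_to"] = effective
    history.append({
        "key": key,
        "attrs": new_attrs,
        "valid_from": effective,
        "valid_to": None,
        "is_current": True,
    })


history = []
apply_scd2(history, 101, {"city": "Pune"}, date(2023, 1, 1))
apply_scd2(history, 101, {"city": "Hyderabad"}, date(2024, 6, 1))
apply_scd2(history, 101, {"city": "Hyderabad"}, date(2024, 7, 1))  # no change
```

After these calls the history holds two rows for key 101: the closed Pune version and the current Hyderabad version.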

Q9. Scenario-based questions related to Spark

Q10. Two SQL coding problems and two Python coding problems, such as reversing a string

Ans.

Reverse a string using SQL and Python codes.

In SQL, use the REVERSE function to reverse a string.
In Python, use slicing with a step of -1 to reverse a string.

Answered by AI
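
The Python half of the answer above could look like the following sketch; the function name is just for illustration.

```python
def reverse_string(s: str) -> str:
    # A slice with step -1 walks the string from the last character to the first.
    return s[::-1]


reversed_word = reverse_string("spark")
```

Interviewers sometimes ask for a version without slicing; `"".join(reversed(s))` is one equivalent alternative.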

Q11. Find top 5 countries with highest population in Spark and SQL

Ans.

Use Spark and SQL to find the top 5 countries with the highest population.

Use Spark to load the data and perform data processing.
Use SQL queries to group by country and sum the population.
Order the results in descending order and limit to top 5.
Example: SELECT country, SUM(population) AS total_population FROM table_name GROUP BY country ORDER BY total_population DESC LIMIT 5

Answered by AI
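
To try the query from the answer above without a Spark cluster, it can be run against an in-memory SQLite database. The table name, columns, and population figures below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_population (country TEXT, city TEXT, population INTEGER)"
)
# Hypothetical sample rows: several cities per country.
rows = [
    ("A", "a1", 50), ("A", "a2", 30),
    ("B", "b1", 70),
    ("C", "c1", 10), ("C", "c2", 15),
    ("D", "d1", 60),
    ("E", "e1", 40),
    ("F", "f1", 5),
]
conn.executemany("INSERT INTO city_population VALUES (?, ?, ?)", rows)

# Aggregate per country, sort by the total, keep the five largest.
top5 = conn.execute(
    """
    SELECT country, SUM(population) AS total_population
    FROM city_population
    GROUP BY country
    ORDER BY total_population DESC
    LIMIT 5
    """
).fetchall()
```

In PySpark the same shape is typically written as a `groupBy` on the country column, a sum aggregate, a descending `orderBy`, and `limit(5)`.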

Q12. Using two tables find the different records for different joins

Ans.

To find different records for different joins using two tables

Use the SQL query to perform different joins like INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN
Identify the key columns in both tables to join on
Select the columns from both tables and use WHERE clause to filter out the different records

Answered by AI
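
A small sketch of how the join types differ on two tiny tables, run through SQLite so it needs no setup. The tables `a` and `b` are hypothetical; RIGHT and FULL joins behave as the mirror image and the union of what is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
    """
)

# INNER JOIN keeps only the ids present in both tables.
inner = conn.execute(
    "SELECT a.id FROM a INNER JOIN b ON a.id = b.id ORDER BY a.id"
).fetchall()

# LEFT JOIN keeps every row of a; unmatched rows get NULL for b's columns.
left = conn.execute(
    "SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id ORDER BY a.id"
).fetchall()

# Filtering on the NULL side isolates the rows of a that have no match in b.
only_in_a = conn.execute(
    "SELECT a.id FROM a LEFT JOIN b ON a.id = b.id WHERE b.id IS NULL"
).fetchall()
```

Comparing the row counts of `inner` and `left` is a quick way to see how many records each join type adds or drops.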

Round 2 - One-on-one

(7 Questions)

Q1. What is the Catalyst optimizer? How does it work?

Ans.

A catalyst optimizer is a query optimization tool used in Apache Spark to improve performance by generating an optimal query plan.

Catalyst optimizer is a rule-based query optimization framework in Apache Spark.
It leverages rules to transform the logical query plan into a more optimized physical plan.
The optimizer applies various optimization techniques like predicate pushdown, constant folding, and join reordering.
By o...

Answered by AI

Q2. Tell me about the optimization you used in your project.

Ans.

Used query optimization techniques to improve performance in database queries.

Utilized indexing to speed up search queries.
Implemented query caching to reduce redundant database calls.
Optimized SQL queries by restructuring joins and subqueries.
Utilized database partitioning to improve query performance.
Used query profiling tools to identify and optimize slow queries.

Answered by AI

Q3. Pyspark question related to merging two schemas?

Q4. What is the best approach to finding whether the data frame is empty or not?

Ans.

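Check whether a single row can be fetched instead of counting every row; len() works on a pandas data frame but not on a PySpark DataFrame.

In PySpark, df.isEmpty() (available from Spark 3.3) or len(df.take(1)) == 0 avoids a full count.
df.count() == 0 also works, but it scans the whole data set.
In pandas, use df.empty or len(df) == 0.
Example (PySpark): if len(df.take(1)) == 0: print('Data frame is empty')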

Answered by AI

Q5. Spark Architecture

Q6. How do you decide on cores and worker nodes?

Ans.

Cores and worker nodes are decided based on the workload requirements and scalability needs of the data processing system.

Consider the size and complexity of the data being processed
Evaluate the processing speed and memory requirements of the tasks
Take into account the parallelism and concurrency needed for efficient data processing
Monitor the system performance and adjust cores and worker nodes as needed

Answered by AI

Q7. What happens when we enforce a schema?

Ans.

Enforcing schema ensures that data conforms to a predefined structure and rules.

Ensures data integrity by validating incoming data against predefined schema
Helps in maintaining consistency and accuracy of data
Prevents data corruption and errors in data processing
Can lead to rejection of data that does not adhere to the schema

Answered by AI

Interview Preparation Tips

Topics to prepare for Persistent Systems Senior Data Engineer interview:

SQL
Pyspark
Python
Spark
Database

Interview preparation tips for other job seekers - Be prepared with Spark core concepts and SQL Coding

Data Engineer Interview Questions & Answers

Koch Business Solutions

Mugilarasan V

posted on 16 Nov 2024

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

6-8 weeks

Result

Selected

Round 1 - Technical

(1 Question)

Q1. More on Technical area

Round 2 - Technical

(1 Question)

Q1. More on Technical area

Round 3 - One-on-one

(1 Question)

Q1. Technical + Behaviour

Round 4 - One-on-one

(1 Question)

Q1. Technical + Behaviour

Round 5 - HR

(1 Question)

Q1. Expectations and general questions

Data Analyst Interview Questions & Answers

L&T Technology Services

Anonymous

posted on 7 Jan 2025

Interview experience

Excellent

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Campus Placement and was interviewed in Dec 2024. There were 2 interview rounds.

Round 1 - Aptitude Test

Basics of mathematical ability and verbal ability

Round 2 - Technical

(2 Questions)

Q1. Introduction - explain projects

Q2. Explain data analytics

Data Engineer Interview Questions & Answers

Happiest Minds Technologies

Nagaraj Purohit

posted on 28 Aug 2024

Interview experience

Good

Round 1 - Technical

(5 Questions)

Q1. Data warehousing related questions

Q2. SQL scenario based questions

Q3. Project experience

Ans.

I have experience working on projects involving data pipeline development, ETL processes, and data warehousing.

Developed ETL processes to extract, transform, and load data from various sources into a data warehouse
Built data pipelines to automate the flow of data between systems and ensure data quality and consistency
Optimized database performance and implemented data modeling best practices
Worked on real-time data pro...

Answered by AI

Q4. Python Based questions

Q5. AWS features and questions

Round 2 - Technical

(2 Questions)

Q1. Similar to the first round, but with relatively more in-depth questions

Q2. Asked about career goals and stuff

Round 3 - HR

(2 Questions)

Q1. General work related conversation

Q2. Salary discussion

Data Engineer Interview Questions & Answers

Coforge

Anonymous

posted on 6 Dec 2024

Interview experience

Average

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

Not Selected

I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.

Round 1 - Technical

(2 Questions)

Q1. Spark Architecture

Q2. Cache vs persist, lazy evaluation

Data Analyst Interview Questions & Answers

eClerx

Anonymous

posted on 22 May 2024

Interview experience

Good

Difficulty level

Moderate

Process Duration

Less than 2 weeks

Result

No response

I applied via Job Portal and was interviewed in Apr 2024. There were 2 interview rounds.

Round 1 - HR

(1 Question)

Q1. General discussion of background and availability for the technical interview; expected salary and current salary

Round 2 - Technical

(8 Questions)

Q1. Tell me about yourself.

Q2. Which project have you recently worked on?

Q3. How much experience do you have in Python and R?

Ans.

I have 3 years of experience in Python and 2 years in R.

3 years of experience in Python, used for data cleaning, analysis, and visualization.
2 years of experience in R, utilized for statistical analysis and data modeling.
Proficient in using libraries like pandas, numpy, matplotlib in Python and ggplot2, dplyr in R.

Answered by AI

Q4. What project have you worked on in R programming?

Ans.

I have worked on a project in R programming analyzing customer churn for a telecommunications company.

Used R programming to clean and analyze customer data
Created visualizations to identify patterns and trends in customer behavior
Built predictive models to forecast customer churn rates
Collaborated with stakeholders to present findings and recommendations

Answered by AI

Q5. Which databases have you worked on?

Ans.

I have worked on databases such as MySQL, SQL Server, and MongoDB.

MySQL
SQL Server
MongoDB

Answered by AI

Q6. Given a list of strings, find the string matching a particular pattern.

Ans.

Find a string matching a specific pattern in a list of strings.

Use regular expressions to search for the pattern in each string.
Iterate through the list of strings and apply the pattern matching logic.
Return the string that matches the pattern, if found.
Example: List of strings - ['apple', 'banana', 'cherry'], pattern - 'ba' would return 'banana'.

Answered by AI
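
The approach above can be sketched with the standard `re` module. The helper name is hypothetical, and it returns every matching string rather than only the first, which is one reasonable reading of the question.

```python
import re


def find_matches(strings, pattern):
    """Return every string in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    return [s for s in strings if regex.search(s)]


matches = find_matches(["apple", "banana", "cherry"], "ba")
starts_with_ch = find_matches(["apple", "banana", "cherry"], r"^ch")
```

`search` matches anywhere in the string; anchoring the pattern with `^` or `$` restricts it to the start or end.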

Q7. Given a list of dictionaries, find the dictionary whose key has the highest count among all the dictionaries, e.g. [{a:5},{b:2}, ...]. Here 5 is the highest value, so it should be printed.

Ans.

Find the dictionary with the highest count of keys in a list of dictionaries.

Iterate through the list of dictionaries and keep track of the dictionary with the highest count of keys.
Use a loop to count the keys in each dictionary and compare it with the current highest count.
Return the dictionary with the highest count of keys.

Answered by AI
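
The question's example ([{a:5},{b:2}, ...]) suggests comparing the values stored in the dictionaries, while the answer above compares the number of keys. The sketch below shows both readings; the function name and sample data are made up for illustration.

```python
def dict_with_highest_value(dicts):
    """Return the dictionary holding the largest value across all dictionaries."""
    non_empty = [d for d in dicts if d]
    if not non_empty:
        return None
    return max(non_empty, key=lambda d: max(d.values()))


data = [{"a": 5}, {"b": 2}, {"c": 4, "d": 1}]

# Reading 1: the dictionary containing the highest value.
best = dict_with_highest_value(data)

# Reading 2: the dictionary with the most keys.
most_keys = max(data, key=len)
```

Asking the interviewer which reading is intended before coding is usually worth the few seconds it takes.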

Q8. Do you have experience in converting python scripts to R?

Ans.

Yes, I have experience in converting Python scripts to R.

I have converted several Python scripts to R for data analysis projects.
I am proficient in both Python and R programming languages.
I can provide examples of projects where I have successfully converted Python scripts to R.

Answered by AI

Interview Preparation Tips

Topics to prepare for eClerx Data Analyst interview:

Python
R programming
Database
Lists
Dictionaries
Data Structures

Interview preparation tips for other job seekers - Prepare basics of Python and R.
Focus on accessing elements from different structures and performing basic operations on them.

Senior Data Engineer Interview Questions & Answers

Mphasis

Anonymous

posted on 31 Aug 2024

Interview experience

Excellent

Round 1 - Technical

(3 Questions)

Q1. reduceByKey vs groupByKey

Ans.

reduceByKey is more efficient than groupByKey for aggregating data in Spark due to reduced shuffling.

reduceByKey combines values for each key in each partition before shuffling data
groupByKey shuffles every key-value pair across the network, with no combining inside the partition, before grouping the values for each key
reduceByKey is preferred for large datasets to minimize data movement and improve performance

Answered by AI
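
The difference in shuffle volume can be illustrated without Spark by simulating partitions as plain Python lists. This is a toy model, not Spark's implementation; the function names and sample data are invented for the example.

```python
def group_by_key_shuffle(partitions):
    """Model of groupByKey: every (key, value) pair is shuffled as-is."""
    shuffled = [pair for part in partitions for pair in part]
    grouped = {}
    for key, value in shuffled:
        grouped.setdefault(key, []).append(value)
    totals = {key: sum(values) for key, values in grouped.items()}
    return len(shuffled), totals


def reduce_by_key_shuffle(partitions, func):
    """Model of reduceByKey: combine per key inside each partition, then shuffle."""
    shuffled = []
    for part in partitions:
        combined = {}
        for key, value in part:
            combined[key] = func(combined[key], value) if key in combined else value
        # At most one record per key leaves each partition.
        shuffled.extend(combined.items())
    totals = {}
    for key, value in shuffled:
        totals[key] = func(totals[key], value) if key in totals else value
    return len(shuffled), totals


partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("b", 1), ("b", 1), ("b", 1)],
]
group_count, group_totals = group_by_key_shuffle(partitions)
reduce_count, reduce_totals = reduce_by_key_shuffle(partitions, lambda x, y: x + y)
```

Both paths produce the same per-key totals, but the reduce-style path moves fewer records between partitions, and the gap grows with the number of repeated keys per partition.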

Q2. Word count in Scala

Ans.

Scala provides a simple way to count words in a string using built-in functions.

Use the split function to split the string into an array of words
Use the length function to get the count of words in the array

Answered by AI

Q3. Second highest salary SQL

Ans.

Use SQL query with ORDER BY and LIMIT to find the second highest salary.

Use ORDER BY clause to sort salaries in descending order
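Use LIMIT 1 OFFSET 1 (written LIMIT 1,1 in MySQL) to skip the highest salary and return the second highest; select DISTINCT salaries so that a tie for the top salary is not returned as the second highest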

Answered by AI
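
A runnable sketch of the query, using an in-memory SQLite database with a made-up `employee` table. The sample includes a tie at the top salary to show why DISTINCT matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
# Hypothetical data with two employees sharing the highest salary.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("a", 900), ("b", 900), ("c", 700), ("d", 500)],
)

# DISTINCT collapses the tie at 900 so OFFSET 1 lands on the next salary.
second = conn.execute(
    "SELECT DISTINCT salary FROM employee ORDER BY salary DESC LIMIT 1 OFFSET 1"
).fetchone()[0]

# Alternative: the largest salary strictly below the maximum.
second_alt = conn.execute(
    "SELECT MAX(salary) FROM employee "
    "WHERE salary < (SELECT MAX(salary) FROM employee)"
).fetchone()[0]
```

The subquery form is portable across dialects that lack LIMIT, which is why it is often offered as a second answer.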

Data Analyst Interview Questions & Answers

3i Infotech

Anonymous

posted on 3 Sep 2024

Interview experience

Poor

Difficulty level

Moderate

Process Duration

2-4 weeks

Result

Not Selected

I applied via Approached by Company and was interviewed in Mar 2024. There were 2 interview rounds.

Round 1 - Coding Test

Coding test for all students attending the first round. Give your best, practice every day, and all the best for your interview.

Round 2 - One-on-one

(2 Questions)

Q1. What are your interests?

Ans.

My interests include data analysis, problem-solving, and continuous learning.

Data analysis
Problem-solving
Continuous learning

Answered by AI

Q2. Java related question

Interview Preparation Tips

Interview preparation tips for other job seekers - Give your best

ValueLabs Salaries in India

Senior Software Engineer (2.2k salaries): ₹4.4 L/yr - ₹25 L/yr
Software Engineer (823 salaries): ₹7.1 L/yr - ₹14 L/yr
Analyst (551 salaries): ₹8.5 L/yr - ₹31 L/yr
Technical Lead (414 salaries): ₹12 L/yr - ₹42 L/yr
System Analyst (386 salaries): ₹9 L/yr - ₹34 L/yr