LTIMindtree
Joining two tables in SQL combines rows based on a related column, enabling comprehensive data analysis.
Use INNER JOIN to return only matching rows from both tables. Example: SELECT * FROM TableA INNER JOIN TableB ON TableA.id = TableB.a_id;
LEFT JOIN returns all rows from the left table and matched rows from the right table. Example: SELECT * FROM TableA LEFT JOIN TableB ON TableA.id = TableB.a_id;
RIGHT JOIN retur...
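The difference between INNER JOIN and LEFT JOIN can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production query: the TableA and TableB names come from the examples above, while the name and item columns and the sample rows are assumptions made up for the demonstration.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE TableB (a_id INTEGER, item TEXT);
    INSERT INTO TableA VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO TableB VALUES (1, 'pen'), (1, 'ink'), (2, 'cup');
""")

# INNER JOIN: only rows of TableA that have a match in TableB.
inner_rows = conn.execute("""
    SELECT TableA.name, TableB.item
    FROM TableA INNER JOIN TableB ON TableA.id = TableB.a_id
    ORDER BY TableA.id, TableB.item
""").fetchall()

# LEFT JOIN: every row of TableA; unmatched rows get NULL (None) for item.
left_rows = conn.execute("""
    SELECT TableA.name, TableB.item
    FROM TableA LEFT JOIN TableB ON TableA.id = TableB.a_id
    ORDER BY TableA.id, TableB.item
""").fetchall()

print(inner_rows)
print(left_rows)
```

Here 'carol' has no matching row in TableB, so she is absent from the INNER JOIN result and appears in the LEFT JOIN result with None in the item column.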
Performance optimization in Spark involves tuning configurations, optimizing code, and utilizing caching.
Tune Spark configurations such as executor memory, cores, and parallelism
Optimize code by reducing unnecessary shuffles, using efficient transformations, and avoiding unnecessary data movements
Utilize caching to store intermediate results in memory for faster access
Reading and writing in data engineering refer to the processes of data ingestion and data output in systems.
Reading data involves extracting information from various sources like databases, APIs, or files.
Writing data refers to storing processed or raw data into databases, data lakes, or other storage solutions.
Example of reading: Using SQL queries to fetch data from a relational database.
Example of writing: Using ETL...
Lambda is a serverless computing service that runs code in response to events without provisioning servers.
Lambda allows you to run code in response to triggers such as HTTP requests via API Gateway.
It supports multiple programming languages, including Python, Node.js, and Java.
You only pay for the compute time you consume, making it cost-effective for variable workloads.
Lambda can be integrated with other AWS ser...
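As a rough sketch of the event-driven model described above, a Python Lambda function is a handler that receives an event and a context object. The handler below assumes an API Gateway proxy-style event with a queryStringParameters field; the greeting logic and the hand-written test event are hypothetical and only serve to show the shape of the code.

```python
import json


def lambda_handler(event, context):
    """Hypothetical handler for an HTTP request routed through API Gateway."""
    # queryStringParameters may be missing or None when no parameters are sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    # A proxy-style response carries a status code, headers, and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }


# Local invocation with a hand-written event; no AWS account is needed for this.
response = lambda_handler({"queryStringParameters": {"name": "data"}}, None)
print(response["body"])
```

Because the handler is a plain function, it can be exercised locally like this before it is deployed.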
A Spark cluster is a group of interconnected computers that work together to process large datasets using Apache Spark.
Consists of a master node and multiple worker nodes
Master node manages the distribution of tasks and resources
Worker nodes execute the tasks in parallel
Used for processing big data and running distributed computing jobs
Hive is a data warehouse system built on top of Hadoop for querying and analyzing large datasets stored in HDFS.
Hive translates SQL-like queries into MapReduce jobs (or Tez or Spark jobs, depending on the configured execution engine) to process data stored in HDFS
It uses a metastore to store metadata about tables and partitions
HiveQL is the query language used in Hive, similar to SQL
Hive supports partitioning and bucketing for optimizing queries; indexing was available in older releases but was removed in Hive 3.0
Dense rank in SQL assigns consecutive ranks to the rows of a result set: rows that tie share the same rank, and no ranks are skipped after a tie.
DENSE_RANK() is a window function, used with an OVER (ORDER BY ...) clause that defines the ranking order.
It differs from regular rank in that it does not skip ranks if there are ties.
For example, if two rows have the same value and are ranked 1st, the next row will be ranked 2nd, not 3rd.
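The contrast with RANK() can be checked with a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The scores table and its rows are assumptions invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('a', 90), ('b', 90), ('c', 80), ('d', 70);
""")

# RANK() leaves a gap after the tie; DENSE_RANK() continues with the next integer.
ranked = conn.execute("""
    SELECT name, score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

for row in ranked:
    print(row)
```

Rows 'a' and 'b' tie on 90 and both get rank 1 under either function; row 'c' then gets 3 from RANK() but 2 from DENSE_RANK().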
PySpark is the Python API for Apache Spark, a powerful open-source distributed computing system.
PySpark allows users to write Spark applications in the Python programming language.
It provides high-level APIs in Python for Spark's core functionality.
PySpark can be used for processing large datasets in a distributed computing environment.
Example: Using PySpark to perform data analysis and machine learning tasks on big ...
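Use the withColumn method in PySpark, together with the concat function, to combine two columns in a DataFrame.
withColumn adds a new column (or replaces an existing one with the same name); concat builds the combined value
Specify the new column name and the expression that combines the two columns
Example (after from pyspark.sql.functions import concat, col, lit): df = df.withColumn('combined_column', concat(col('column1'), lit(' '), col('column2')))
Note that concat returns null if either input is null; concat_ws(' ', col('column1'), col('column2')) skips nulls instead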
Spark architecture includes driver, executor, and cluster manager components for distributed data processing.
Spark architecture consists of a driver program that manages the execution of tasks across multiple worker nodes.
Executors are responsible for executing tasks on worker nodes and storing data in memory or disk.
The cluster manager (for example Standalone, YARN, or Kubernetes) allocates resources across the cluster; the driver schedules tasks onto the executors.
Spark applicat...
Use a SQL query with the MAX function to find the highest salary in a table.
Use SELECT MAX(salary) FROM table_name;
Make sure to replace 'salary' and 'table_name' with the actual column and table names.
Ensure proper permissions to access the table.
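A minimal sketch of the MAX query, run through Python's built-in sqlite3 module. The employees table, its columns, and the sample rows are assumptions for illustration only; the second query is one possible way to also retrieve who earns that salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('asha', 'eng', 120),
        ('ben',  'eng', 150),
        ('chen', 'ops', 90);
""")

# Highest salary in the whole table.
highest_salary = conn.execute("SELECT MAX(salary) FROM employees").fetchone()[0]

# Names of the employee(s) earning that salary, using a subquery.
top_earners = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employees "
        "WHERE salary = (SELECT MAX(salary) FROM employees) ORDER BY name"
    )
]

print(highest_salary, top_earners)
```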
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
Codility test on SQL, Spark, and Python
The technical interviewer asked about Spark, Python, and SQL
I will bring strong programming skills, experience with big data technologies, and a deep understanding of data processing and analysis.
Strong programming skills in languages like Python, Java, or Scala
Experience with big data technologies such as Hadoop, Spark, and Kafka
Deep understanding of data processing and analysis techniques
Ability to design and implement scalable data pipelines
Experience with cloud platforms li...
I am a Senior Data Engineer with extensive experience in data architecture, ETL processes, and big data technologies.
Data Architecture: I have designed scalable data architectures for various organizations, ensuring efficient data storage and retrieval.
ETL Processes: Developed robust ETL pipelines using tools like Apache NiFi and Talend, which improved data processing times by 30%.
Big Data Technologies: Proficient in u...
I applied via Naukri.com and was interviewed in Jul 2024. There was 1 interview round.
Handle missing data in a PySpark DataFrame by using functions like dropna, fillna, or replace.
Use the dropna() function to remove rows with missing data
Use the fillna() function to fill missing values with a specified value
Use the replace() function to swap placeholder values that stand in for missing data (such as an empty string or -1) with a specified value
Use PySpark to find the maximum salary paid to customers in each department from a given table.
Load the data into a DataFrame using spark.read.
Group the data by department using groupBy() method.
Use agg() function to calculate the maximum salary for each department.
Example: df.groupBy('department').agg({'salary': 'max'})
Show the results using show() method.
I was approached by the company.
Aptitude test and 2 coding questions
Object-oriented programming concepts in Java
Encapsulation: bundling data and methods that operate on the data into a single unit
Inheritance: allows a class to inherit properties and behavior from another class
Polymorphism: ability of a method to do different things based on the object it is acting upon
Abstraction: hiding the implementation details and showing only the functionality to the user
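The four concepts above can be sketched in a few lines. The question concerns Java, but the sketch below uses Python as an analogue; the Shape, Rectangle, and Square classes are hypothetical names chosen for the illustration, and the same structure maps onto a Java abstract class with subclasses.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstraction: callers see name and area(), not how area is computed."""

    def __init__(self, name):
        self._name = name  # Encapsulation: state kept behind a read-only property.

    @property
    def name(self):
        return self._name

    @abstractmethod
    def area(self):
        """Each concrete shape supplies its own implementation."""


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


class Square(Rectangle):
    """Inheritance: reuses Rectangle's area() unchanged."""

    def __init__(self, side):
        super().__init__(side, side)
        self._name = "square"


# Polymorphism: the same calls work on any Shape, whatever its concrete class.
areas = [(shape.name, shape.area()) for shape in (Rectangle(2, 3), Square(4))]
print(areas)
```

Trying to instantiate Shape directly raises a TypeError, which is how the abstract base class enforces that only concrete subclasses are used.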
My best friends are like family to me, always there to support and uplift me.
They are always there for me in good times and bad.
We share common interests and hobbies.
They understand me and accept me for who I am.
We have created many unforgettable memories together.
I can always count on them to give me honest advice and feedback.
Mindtree is an Indian multinational information technology and outsourcing company.
Founded in 1999
Headquartered in Bangalore, India
Provides IT services and consulting
Acquired by L&T in 2019
I applied via Recruitment Consultant and was interviewed in Nov 2024. There was 1 interview round.
DWT concept-based questions. Even though the questions were answered perfectly and the interviewer ended on a promising note, there was no callback from them.
Spark and Big data related questions
The LTIMindtree Senior Data Engineer interview process varies in duration but typically takes less than 2 weeks to complete (based on 68 interview experiences).
Senior Software Engineer (22k salaries): ₹7.4 L/yr - ₹21.6 L/yr
Software Engineer (16.3k salaries): ₹3.9 L/yr - ₹8.8 L/yr
Technical Lead (6.4k salaries): ₹16.4 L/yr - ₹28.3 L/yr
Module Lead (5.7k salaries): ₹11.8 L/yr - ₹20.4 L/yr
Senior Engineer (4.4k salaries): ₹5.8 L/yr - ₹14 L/yr