LTIMindtree
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Optimizing SQL queries involves using indexes, avoiding unnecessary joins, and optimizing the query structure.
Use indexes on columns frequently used in WHERE clauses
Avoid using SELECT * and only retrieve necessary columns
Optimize joins by using INNER JOIN instead of OUTER JOIN when possible
Use EXPLAIN to analyze query performance and make the necessary adjustments, as in the sketch below
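These points can be exercised end to end with Python's built-in sqlite3 module; a minimal sketch, where the orders table and the idx_orders_customer index are invented for illustration:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 100, i * 1.5) for i in range(1000)],
    )

    # Index the column used in the WHERE clause
    cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

    # Retrieve only the needed columns instead of SELECT *
    query = "SELECT id, amount FROM orders WHERE customer_id = ?"

    # EXPLAIN (QUERY PLAN in SQLite's dialect) shows whether the index is used
    for row in cur.execute("EXPLAIN QUERY PLAN " + query, (42,)):
        print(row)   # expect a 'SEARCH ... USING INDEX idx_orders_customer' entry

    print(cur.execute(query, (42,)).fetchall()[:3])
    conn.close()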
Performance optimization in Spark involves tuning configurations, optimizing code, and utilizing caching.
Tune Spark configurations such as executor memory, number of executors, and shuffle partitions.
Optimize code by reducing unnecessary shuffles, using efficient transformations, and avoiding unnecessary data movements.
Utilize caching to store intermediate results in memory and avoid recomputation; a configuration-and-caching sketch follows below.
Example: In my projec...
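A short PySpark sketch of these levers; the configuration values are illustrative starting points, not recommendations, and spark.executor.instances only takes effect on a real cluster:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("tuning-sketch")
        # Tune executor resources and shuffle parallelism for the workload
        .config("spark.executor.memory", "4g")
        .config("spark.executor.instances", "4")
        .config("spark.sql.shuffle.partitions", "200")
        .getOrCreate()
    )

    df = spark.range(1_000_000)

    # Cache an intermediate result that is reused, so it is not recomputed
    filtered = df.filter(df.id % 2 == 0).cache()
    print(filtered.count())                        # first action materializes the cache
    print(filtered.selectExpr("sum(id)").first())  # second action reuses cached data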
SparkContext is the main entry point for Spark functionality, while SparkSession is the entry point for Spark SQL.
SparkContext is the entry point for low-level API functionality in Spark.
SparkSession is the entry point for Spark SQL functionality.
SparkContext is used to create RDDs (Resilient Distributed Datasets) in Spark.
SparkSession provides a unified entry point for reading data from various sources and performing SQL queries on structured data; both entry points are contrasted in the sketch below.
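A minimal sketch of how the two entry points relate; since Spark 2.x, SparkSession wraps a SparkContext:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("entry-points").getOrCreate()

    # The low-level SparkContext is available from the session
    sc = spark.sparkContext

    # SparkContext: create and transform an RDD directly
    rdd = sc.parallelize([1, 2, 3, 4])
    print(rdd.map(lambda x: x * 2).collect())

    # SparkSession: unified entry point for DataFrames and SQL
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
    df.createOrReplaceTempView("t")
    spark.sql("SELECT id FROM t WHERE letter = 'a'").show()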
When a Spark job is submitted, several steps are executed in the backend to process it.
The job is submitted to the Spark driver program.
The driver program communicates with the cluster manager to request resources.
The cluster manager allocates resources (CPU, memory) to the job.
The driver program creates a DAG (Directed Acyclic Graph) of the job's stages and tasks.
Tasks are then scheduled and executed on worker nodes; the annotated job below ties these steps to code.
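A small annotated job connecting those steps to code; it also runs locally, where the driver and executors share one machine:

    from pyspark.sql import SparkSession

    # Creating the session starts the driver, which negotiates resources
    # with the cluster manager (YARN, Kubernetes, standalone, ...)
    spark = SparkSession.builder.appName("job-flow").getOrCreate()

    df = spark.range(100)                       # transformation: nothing runs yet
    doubled = df.selectExpr("id * 2 AS twice")  # still lazy

    # The action below makes the driver build the DAG, split it into
    # stages, and schedule tasks on the executors on the worker nodes
    print(doubled.count())

    # explain() prints the physical plan the driver derived from the DAG
    doubled.explain()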
Calculate the second highest salary using SQL and PySpark
Use a SQL query with ORDER BY and LIMIT to get the second highest salary
In PySpark, use orderBy() and take() to achieve the same result; both approaches are sketched below
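One way to sketch both approaches in a single PySpark session; the employees data is made up for the example:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("second-salary").getOrCreate()

    df = spark.createDataFrame(
        [("a", 100), ("b", 300), ("c", 200), ("d", 300)],
        ["name", "salary"],
    )
    df.createOrReplaceTempView("employees")

    # SQL: order distinct salaries descending and keep the top two;
    # the last of those two rows is the second highest
    rows = spark.sql(
        "SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 2"
    ).collect()
    print(rows[-1]["salary"])   # 200

    # DataFrame API: same idea with orderBy() and take()
    second = (
        df.select("salary").distinct()
          .orderBy(F.col("salary").desc())
          .take(2)[1]["salary"]
    )
    print(second)               # 200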
The two execution modes usually contrasted are local mode and cluster mode.
Local mode: Spark runs on a single machine in a single JVM and is suitable for development and testing.
Cluster mode: Spark runs on a cluster of machines managed by a cluster manager such as Spark Standalone, YARN, Mesos, or Kubernetes, which suits production workloads.
Client mode gives lower latency because the driver runs on the submitting machine and communicates with the cluster directly.
In client mode, the driver stays on the client and talks to the executors directly, reducing round trips.
In cluster mode, the driver runs on a worker node inside the cluster, adding a communication hop between the user and the driver.
Client mode is therefore preferred for interactive and real-time applications where low latency is crucial; how the mode is set is sketched below.
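A hedged sketch of how the deploy mode is chosen; the mode is fixed at submission time, and my_app.py is a placeholder name:

    # Client mode: the driver runs on the machine that submits the job
    #   spark-submit --master yarn --deploy-mode client my_app.py
    # Cluster mode: the driver runs inside the cluster
    #   spark-submit --master yarn --deploy-mode cluster my_app.py

    # Inside the application, the effective mode can be read back from config
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("deploy-mode-check").getOrCreate()
    print(spark.conf.get("spark.submit.deployMode", "client"))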
SQL and PySpark code examples for data manipulation and analysis.
Use SQL for structured queries: SELECT, JOIN, GROUP BY.
Example SQL: SELECT name, COUNT(*) FROM patients GROUP BY name;
Use PySpark for big data processing: DataFrame API, RDDs.
Example PySpark: df.groupBy('name').count().show()
Optimize queries with indexing in SQL and caching in PySpark; a combined runnable sketch follows.
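A runnable sketch combining both, using a made-up patients table like the one in the SQL example above:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql-vs-pyspark").getOrCreate()

    df = spark.createDataFrame(
        [("asha", 1), ("ravi", 2), ("asha", 3)],
        ["name", "visit_id"],
    )
    df.createOrReplaceTempView("patients")

    # SQL version
    spark.sql("SELECT name, COUNT(*) AS visits FROM patients GROUP BY name").show()

    # Equivalent DataFrame API version
    df.groupBy("name").count().show()

    # Caching helps when the same DataFrame feeds several queries
    df.cache()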
The first round included aptitude, coding, and comprehension.
Code to sort an array of strings
Use the built-in sort() function in the programming language of your choice
If case-insensitive sorting is required, use a custom comparator
Consider the time complexity of the sorting algorithm used; see the short sketch below
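A short Python sketch of those three points:

    words = ["banana", "Cherry", "apple"]

    # Built-in sort (Timsort: O(n log n) worst case, stable); the default
    # comparison is by code point, so uppercase letters sort first
    print(sorted(words))                 # ['Cherry', 'apple', 'banana']

    # Case-insensitive sort via a key function, Python's idiom
    # for a custom comparator
    print(sorted(words, key=str.lower))  # ['apple', 'banana', 'Cherry']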
I applied via Naukri.com and was interviewed in Sep 2023. There was 1 interview round.
I have 5 years of experience working as a Data Engineer in various industries.
Developed ETL pipelines to extract, transform, and load data from multiple sources into a data warehouse
Optimized database performance by tuning queries and indexes
Implemented data quality checks to ensure accuracy and consistency of data
Worked with cross-functional teams to design and implement data solutions for business needs
I applied via Campus Placement and was interviewed before Dec 2023. There were 3 interview rounds.
It was a basic aptitude test.
I am a data engineer with a strong background in programming and database management.
Experienced in designing and implementing data pipelines
Proficient in SQL, Python, and ETL tools
Skilled in data modeling and optimization
Worked on projects involving big data technologies like Hadoop and Spark
Factors to consider when designing a road curve
Radius of the curve
Speed limit of the road
Banking of the curve
Visibility around the curve
Traffic volume on the road
Road surface conditions
Presence of obstacles or hazards
Environmental factors such as weather conditions
Developed a real-time data processing system for analyzing customer behavior
Used Apache Kafka for real-time data streaming
Implemented data pipelines using Apache Spark for processing large volumes of data
Utilized machine learning algorithms to predict customer behavior
Designed and maintained a data warehouse for storing and querying processed data; the streaming ingestion step is sketched below
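A hedged sketch of the ingestion step of such a pipeline using Spark Structured Streaming's Kafka source; the topic, servers, and aggregation are placeholders, and the spark-sql-kafka connector package must be on the classpath:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("clickstream-sketch").getOrCreate()

    # Read a stream of customer events from Kafka
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder
        .option("subscribe", "customer-events")               # placeholder topic
        .load()
    )

    # Kafka values arrive as bytes; cast them and keep a running count per event
    counts = (
        events.selectExpr("CAST(value AS STRING) AS event")
              .groupBy("event")
              .count()
    )

    # Write the running aggregation to the console for inspection
    query = (
        counts.writeStream.outputMode("complete")
              .format("console")
              .start()
    )
    query.awaitTermination()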
Experienced Data Engineer with a background in computer science and a passion for solving complex problems.
Bachelor's degree in Computer Science
Proficient in programming languages such as Python, SQL, and Java
Experience with big data technologies like Hadoop and Spark
Strong analytical and problem-solving skills
Worked on projects involving data pipelines, ETL processes, and data warehousing
My hobbies include hiking, photography, and playing the guitar.
Hiking: I enjoy exploring nature trails and challenging myself with different terrains.
Photography: I love capturing moments and landscapes through my camera lens.
Playing the guitar: I find relaxation and creativity in strumming chords and learning new songs.
My favorite movie is The Shawshank Redemption.
Directed by Frank Darabont
Based on a Stephen King novella
Themes of hope, friendship, and redemption
Critically acclaimed and considered one of the greatest films of all time
I appeared for an interview before Mar 2024, where I was asked the following questions.
I admire LTIMindtree's innovative approach and commitment to data-driven solutions, making it an ideal place for my growth as a Data Engineer.
LTIMindtree's focus on cutting-edge technologies aligns with my passion for data engineering and analytics.
The company's diverse portfolio offers opportunities to work on various projects, enhancing my skills and experience.
I appreciate LTIMindtree's emphasis on collaboration and...
All aptitude-type questions.
You get about 4-5 minutes to crack each one, so speed matters.
OOP (Object-Oriented Programming) is a programming paradigm based on objects and classes, promoting code reusability and organization.
Encapsulation: Bundling data and methods that operate on the data within one unit (e.g., a class).
Inheritance: Mechanism to create a new class using properties and methods of an existing class (e.g., a 'Dog' class inheriting from an 'Animal' class).
Polymorphism: Ability to present the same interface while different classes supply their own behavior (e.g., a speak() method overridden in subclasses); the three pillars are sketched below.
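A compact Python illustration of the three pillars; the Animal and Dog classes are invented for the example:

    class Animal:
        def __init__(self, name):
            self._name = name       # encapsulation: state lives inside the object

        def speak(self):
            return f"{self._name} makes a sound"

    class Dog(Animal):              # inheritance: Dog reuses Animal's structure
        def speak(self):            # polymorphism: same method, different behavior
            return f"{self._name} barks"

    animals = [Animal("generic"), Dog("Rex")]
    for a in animals:
        print(a.speak())            # dispatches to the subclass implementation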
I applied via Company Website and was interviewed before Feb 2023. There were 3 interview rounds.
Quite a tough coding challenge.
I applied via Company Website and was interviewed before Oct 2020. There were 3 interview rounds.
I applied via Company Website and was interviewed before Feb 2020. There was 1 interview round.
I applied via Job Portal and was interviewed before Dec 2019. There was 1 interview round.
Based on 6 interview experiences, the LTIMindtree Data Engineer interview process typically takes less than 2 weeks to complete, though the duration can vary.
LTIMindtree salaries by role (based on 375 reviews):
Role | Salaries reported | Salary range
Senior Software Engineer | 22k | ₹7.4 L/yr - ₹21.6 L/yr
Software Engineer | 16.3k | ₹3.9 L/yr - ₹8.8 L/yr
Technical Lead | 6.4k | ₹16.4 L/yr - ₹28.3 L/yr
Module Lead | 5.7k | ₹11.8 L/yr - ₹20.4 L/yr
Senior Engineer | 4.4k | ₹5.8 L/yr - ₹14 L/yr