LTIMindtree
I applied via Approached by Company and was interviewed in Sep 2024. There were 3 interview rounds.
Spark memory management optimizes resource allocation for efficient data processing in distributed computing environments.
Spark uses a unified memory management model that divides memory into execution and storage regions.
By default, spark.memory.fraction gives the unified region 60% of the heap, and spark.memory.storageFraction protects 50% of that region for storage; both are configurable and the boundary between execution and storage is soft.
Spark employs a mechanism called 'Tungsten' for off-heap memory management, which reduces garbage-collection overhead and improves memory efficiency.
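A minimal sketch of the knobs mentioned above, assuming PySpark; the config names are real Spark settings, but the values are illustrative, not recommendations:

```python
# Hedged sketch: configuring Spark's unified memory model and Tungsten
# off-heap allocation. Values are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("memory-tuning-demo")
    # Fraction of heap for the unified execution + storage region (default 0.6).
    .config("spark.memory.fraction", "0.6")
    # Share of that region protected for cached blocks (default 0.5).
    .config("spark.memory.storageFraction", "0.5")
    # Off-heap allocation must be enabled and sized explicitly.
    .config("spark.memory.offHeap.enabled", "true")
    .config("spark.memory.offHeap.size", "2g")
    .getOrCreate()
)
```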
Performance optimization in Spark involves tuning configurations, optimizing code, and utilizing best practices.
Tune Spark configurations such as executor memory, number of executors, and shuffle partitions
Optimize code by reducing unnecessary shuffling, using efficient transformations, and caching intermediate results
Utilize best practices like using data partitioning, avoiding unnecessary data movements, and leveraging broadcast joins for small tables
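A hedged sketch of those tuning points in PySpark; the config names are real Spark settings, while the values, file paths, and join key (user_id) are assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    .appName("perf-tuning-demo")
    .config("spark.executor.memory", "4g")          # per-executor heap
    .config("spark.executor.instances", "10")       # number of executors
    .config("spark.sql.shuffle.partitions", "200")  # shuffle parallelism
    .getOrCreate()
)

big = spark.read.parquet("/data/events")       # hypothetical datasets
small = spark.read.parquet("/data/dim_users")

# Broadcasting the small side avoids a shuffle; caching avoids recomputing
# a result that is reused downstream.
joined = big.join(broadcast(small), "user_id").cache()
joined.count()  # action that materializes the cache
```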
Transformations and actions are key concepts in Apache Spark for processing data.
Transformations are operations that create a new RDD from an existing one, like map, filter, and reduceByKey.
Actions are operations that trigger computation and return a result to the driver program, like count, collect, and saveAsTextFile.
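A minimal PySpark illustration of the distinction: the transformations build lineage lazily, and only the actions at the end trigger computation.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-demo").master("local[*]").getOrCreate()
rdd = spark.sparkContext.parallelize([1, 2, 3, 4])

doubled = rdd.map(lambda x: x * 2)            # transformation: nothing runs yet
evens = doubled.filter(lambda x: x % 4 == 0)  # transformation: still lazy

print(evens.count())    # action: triggers the job, prints 2
print(evens.collect())  # action: returns [4, 8] to the driver
```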
I have worked at ABC Company as a Data Engineer, where I led projects on data pipeline development and optimization.
Led projects on data pipeline development and optimization
Worked at ABC Company as a Data Engineer
I applied via Naukri.com and was interviewed in Aug 2024. There was 1 interview round.
Query to identify and delete duplicate records in SQL
Use a combination of SELECT and DELETE statements
Identify duplicates using GROUP BY and HAVING clauses
Delete duplicates based on a unique identifier or combination of columns
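One common shape of that query, sketched here against an in-memory SQLite table; the employees table and its columns are hypothetical. The pattern keeps the lowest id per duplicate group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO employees (name, email) VALUES
        ('Asha', 'asha@x.com'), ('Asha', 'asha@x.com'), ('Ravi', 'ravi@x.com');
""")

# Duplicates can be inspected first with GROUP BY ... HAVING COUNT(*) > 1,
# then removed by keeping one id per group.
conn.execute("""
    DELETE FROM employees
    WHERE id NOT IN (SELECT MIN(id) FROM employees GROUP BY name, email)
""")
print(conn.execute("SELECT * FROM employees").fetchall())
# [(1, 'Asha', 'asha@x.com'), (3, 'Ravi', 'ravi@x.com')]
```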
Spark architecture enables distributed data processing using resilient distributed datasets (RDDs) and a master-slave model.
Spark consists of a driver program that coordinates the execution of tasks across a cluster.
The cluster manager (like YARN or Mesos) allocates resources for Spark applications.
Data is processed in parallel using RDDs, which are immutable collections of objects.
Spark supports various data sources, such as HDFS, Amazon S3, and JDBC databases.
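A small sketch of that model: the script below is the driver program; the master setting is a placeholder (local[*] for testing, YARN or Mesos on a real cluster), and executors process the RDD's partitions in parallel.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("architecture-demo")
    .master("local[*]")  # swap for a real cluster manager in production
    .getOrCreate()
)

# An RDD: an immutable, partitioned collection processed in parallel.
rdd = spark.sparkContext.parallelize(range(1_000_000), numSlices=8)
print(rdd.map(lambda x: x * x).sum())
```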
Various optimisation techniques were used in my project to improve performance and efficiency.
Implemented indexing to speed up database queries
Utilized caching to reduce redundant data retrieval
Applied parallel processing to distribute workloads efficiently
Optimized algorithms to reduce time complexity
Used query optimization techniques to improve database performance
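A toy sketch of two of those techniques, caching and parallel processing, in plain Python; fetch_customer is a hypothetical stand-in for a database or API call:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=1024)
def fetch_customer(customer_id: int) -> tuple:
    # Repeated ids are served from the cache instead of re-fetching.
    return (customer_id, "gold" if customer_id % 2 else "basic")

ids = [1, 2, 3, 1, 2, 3]
with ThreadPoolExecutor(max_workers=4) as pool:
    rows = list(pool.map(fetch_customer, ids))
print(rows)  # the second 1, 2, 3 lookups hit the cache
```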
AWS Lambda is a serverless computing service provided by Amazon Web Services.
AWS Lambda allows you to run code without provisioning or managing servers.
It automatically scales based on the incoming traffic.
You only pay for the compute time you consume.
Supports multiple programming languages like Node.js, Python, Java, etc.
Can be triggered by various AWS services like S3, DynamoDB, API Gateway, etc.
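A minimal sketch of a Python Lambda handler for an S3 trigger; the event fields follow the standard S3 notification shape:

```python
import json

def lambda_handler(event, context):
    # Each record describes one S3 object event.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"New object: s3://{bucket}/{key}")
    return {"statusCode": 200, "body": json.dumps("processed")}
```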
Handle incremental data by using tools like Apache Kafka for real-time data streaming and implementing CDC (Change Data Capture) for database updates.
Utilize tools like Apache Kafka for real-time data streaming
Implement CDC (Change Data Capture) for tracking database updates
Use data pipelines to process and integrate incremental data
Ensure data consistency and accuracy during incremental updates
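A hedged sketch of consuming CDC-style change events from Kafka using the kafka-python package; the topic name, servers, and Debezium-style op codes are assumptions about the setup:

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders.cdc",                        # hypothetical CDC topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    change = message.value
    # A CDC event typically carries an operation type and before/after images.
    if change.get("op") in ("c", "u"):   # create/update (Debezium-style codes)
        print("upsert:", change.get("after"))
    elif change.get("op") == "d":
        print("delete:", change.get("before"))
```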
Project architecture defines the structure and components of a data engineering project, ensuring scalability and efficiency.
Define data sources: Identify where data will come from, e.g., databases, APIs, or IoT devices.
Choose a data storage solution: Options include data lakes (e.g., AWS S3) or data warehouses (e.g., Snowflake).
Implement data processing: Use ETL (Extract, Transform, Load) tools like Apache Spark or Apache Airflow pipelines.
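An illustrative ETL skeleton in PySpark matching those steps; the S3 paths, column names, and aggregation are placeholders, not a prescribed design:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw data landed from a source system.
raw = spark.read.json("s3://my-raw-bucket/orders/")

# Transform: clean and aggregate.
daily = (
    raw.filter(F.col("status") == "COMPLETE")
       .groupBy(F.to_date("created_at").alias("order_date"))
       .agg(F.sum("amount").alias("revenue"))
)

# Load: write partitioned Parquet to the curated zone.
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://my-curated-bucket/daily_revenue/"
)
```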
Catalyst optimizer is a query optimization framework in Apache Spark that improves performance by applying various optimization techniques.
It is a query optimization framework in Apache Spark.
It improves performance by applying various optimization techniques.
It leverages techniques like predicate pushdown, column pruning, and constant folding to optimize queries.
Catalyst optimizer generates an optimized logical plan, then one or more physical plans, and selects the best one for execution.
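A small way to see Catalyst's output, assuming Spark 3.x: explain() prints the optimized plan, where pushed filters and a pruned read schema reflect the techniques above. The dataset path is hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("catalyst-demo").getOrCreate()
df = spark.read.parquet("/tmp/events")

query = df.select("user_id").where(F.col("event_date") == "2024-01-01")
# For a columnar source, look for PushedFilters and a narrow ReadSchema.
query.explain(mode="formatted")
```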
I applied via Approached by Company and was interviewed in Apr 2024. There were 2 interview rounds.
Conducted by CITI Karat. There were 2 SQL coding questions. The total interview was 1 hr: the first 15 min was introduction, 25 min went to the 1st question, and 20 min to the 2nd. The questions were a little tricky.
Indexes in databases help improve query performance by allowing faster data retrieval.
Types of indexes include clustered, non-clustered, unique, and composite indexes.
Clustered indexes physically reorder the data in the table based on the index key.
Non-clustered indexes create a separate structure that includes the indexed columns and a pointer to the actual data.
Unique indexes ensure that no two rows have the same value in the indexed column or columns.
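A toy illustration with SQLite via Python (SQLite has no separate clustered index; its rowid table plays that role), showing unique and composite indexes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT, age INT)")

# Unique index: rejects duplicate emails.
conn.execute("CREATE UNIQUE INDEX ix_users_email ON users(email)")
# Composite index: serves filters on city, or on city plus age.
conn.execute("CREATE INDEX ix_users_city_age ON users(city, age)")

conn.execute("INSERT INTO users (email, city, age) VALUES ('a@x.com', 'Pune', 30)")
try:
    conn.execute("INSERT INTO users (email, city, age) VALUES ('a@x.com', 'Delhi', 25)")
except sqlite3.IntegrityError as e:
    print("unique index enforced:", e)
```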
I applied via LinkedIn and was interviewed in Feb 2024. There were 2 interview rounds.
I appeared for an interview in Sep 2024, where I was asked the following questions.
I applied via Approached by Company and was interviewed in Aug 2023. There were 3 interview rounds.
map() and reduce() are higher-order functions used in functional programming to transform and aggregate data respectively.
map() applies a given function to each element of an array and returns a new array with the transformed values.
reduce() applies a given function to the elements of an array in a cumulative way, reducing them to a single value.
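The same two higher-order functions sketched in plain Python:

```python
from functools import reduce

nums = [1, 2, 3, 4]
squares = list(map(lambda x: x * x, nums))          # transform each element
total = reduce(lambda acc, x: acc + x, squares, 0)  # fold to a single value
print(squares, total)  # [1, 4, 9, 16] 30
```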
I applied via LinkedIn and was interviewed in Aug 2023. There were 3 interview rounds.
Creating data pipelines, processing requests, web crawling, scraping, and reading large CSV files in Python.
Use tools like Apache Airflow or Luigi to create data pipelines
Implement distributed computing frameworks like Apache Spark for processing millions of requests
Utilize libraries like Scrapy or Beautiful Soup for web crawling and scraping
Use pandas library in Python to efficiently read and process large CSV files
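A minimal sketch of the chunked-CSV point with pandas; the file name and the amount column are placeholders:

```python
import pandas as pd

total = 0.0
for chunk in pd.read_csv("big_file.csv", chunksize=100_000):
    # Each chunk is a small DataFrame, so memory use stays bounded.
    total += chunk["amount"].sum()
print(total)
```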
The Scrum role involves daily activities in development and implementing an automation framework.
As a Data Engineering Specialist, the Scrum role involves participating in daily stand-up meetings to discuss progress and obstacles.
Daily activities may include coding, testing, debugging, and collaborating with team members to deliver high-quality software.
Implementing an automation framework involves creating scripts or reusable tools that run repetitive tests and tasks without manual effort, as sketched below.
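A hedged sketch of the smallest building block such a framework might contain, a parametrized pytest check; the function under test is hypothetical:

```python
import pytest

def normalize_email(raw: str) -> str:
    return raw.strip().lower()

@pytest.mark.parametrize("raw, expected", [
    ("  A@X.COM ", "a@x.com"),
    ("b@y.com", "b@y.com"),
])
def test_normalize_email(raw, expected):
    assert normalize_email(raw) == expected
```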
I applied via Referral and was interviewed before Feb 2023. There were 2 interview rounds.
SQL query-based questions
I applied via Naukri.com and was interviewed before Nov 2023. There were 2 interview rounds.
An average test; I joined when it was still Mindtree.
My expected salary is based on my experience, skills, and the market rate for Data Engineering Specialists.
Consider my years of experience in data engineering
Take into account my specialized skills in data processing and analysis
Research the current market rate for Data Engineering Specialists in this region
I applied via Referral and was interviewed before Sep 2022. There were 4 interview rounds.
Optimizing a report involves identifying inefficiencies and implementing improvements to enhance performance.
Identify key performance indicators (KPIs) to focus on
Streamline data collection and processing methods
Utilize efficient algorithms and data structures
Optimize database queries for faster retrieval
Implement caching mechanisms to reduce processing time
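A small sketch of the query-optimization bullet using SQLite's EXPLAIN QUERY PLAN: the same report query before and after adding an index. The table and columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

q = "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales WHERE region = ?"
print(conn.execute(q, ("APAC",)).fetchall())  # full table scan

conn.execute("CREATE INDEX ix_sales_region ON sales(region)")
print(conn.execute(q, ("APAC",)).fetchall())  # now searches the index
```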
Filters in DAX are used to manipulate data in Power BI reports. DAX calculations are used to create custom measures and columns.
Filters in DAX include CALCULATE, FILTER, ALL, ALLEXCEPT, etc.
DAX calculations are used to create custom measures like SUM, AVERAGE, etc.
Example: CALCULATE(SUM(Sales[Amount]), FILTER(Products, Products[Category] = "Electronics")). Note that DAX's SUM takes a column reference (Sales[Amount] is illustrative) and string literals use double quotes.
The duration of the LTIMindtree Data Engineering Specialist interview process can vary, but it typically takes less than 2 weeks to complete.
Salaries at LTIMindtree:

| Role | Salaries reported | Salary range |
| Senior Software Engineer | 22k | ₹6 L/yr - ₹23 L/yr |
| Software Engineer | 16.3k | ₹2 L/yr - ₹10 L/yr |
| Technical Lead | 6.4k | ₹9.5 L/yr - ₹37.5 L/yr |
| Module Lead | 5.7k | ₹7 L/yr - ₹28 L/yr |
| Senior Engineer | 4.4k | ₹4.2 L/yr - ₹16 L/yr |