NTT Data
I was interviewed in Aug 2023.
I applied via Naukri.com and was interviewed before Nov 2023. There were 3 interview rounds.
I applied via Naukri.com and was interviewed in Dec 2024. There was 1 interview round.
posted on 31 Dec 2024
Apache Spark architecture includes a cluster manager, worker nodes, and a driver program.
The cluster manager allocates cluster resources to applications.
Worker nodes host executors, which run tasks and keep data in memory or on disk.
The driver program splits a job into tasks, schedules them on executors, and requests resources from the cluster manager.
Spark applications run as independent sets of processes on a cluster, coordinated by the SparkCon...
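The driver/executor split described above can be modelled in plain Python. This is only a toy sketch, not Spark code: the function `run_job` and its parameters are hypothetical, and a thread pool stands in for executors on worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(data, num_partitions, num_executors, task):
    """Toy 'driver': split data into partitions and run one task per partition."""
    # Driver side: split the input into partitions.
    partitions = [data[i::num_partitions] for i in range(num_partitions)]
    # The thread pool stands in for executors running on worker nodes.
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        partial_results = list(executors.map(task, partitions))
    # Driver side: combine the per-partition results.
    return sum(partial_results)


total = run_job(list(range(1, 101)), num_partitions=4, num_executors=2, task=sum)
print(total)  # 5050
```

In a real cluster the executors are separate processes on other machines, and the cluster manager decides how many of them the application gets.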
reduceByKey aggregates the values for each key, while groupByKey only collects the values for each key.
reduceByKey is a transformation that combines the values of each key using an associative, commutative function; it takes no zero value (foldByKey and aggregateByKey do).
groupByKey is a transformation that groups the values by key and returns each key with an iterable of its values.
reduceByKey is more efficient for aggregating data as it reduces the data before shuffling, while groupByKey shu...
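Assuming "reduceBy" and "groupBy" here refer to Spark's reduceByKey and groupByKey, the efficiency difference can be illustrated with a small pure-Python model. It is not Spark code; the two function names are hypothetical, and "shuffled" simply counts the records that would have to cross the network.

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """groupByKey-style: every (key, value) record crosses the shuffle."""
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def shuffle_reduce_by_key(partitions, func):
    """reduceByKey-style: combine within each partition first, then shuffle."""
    shuffled = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        shuffled.extend(local.items())
    totals = {}
    for key, value in shuffled:
        totals[key] = func(totals[key], value) if key in totals else value
    return totals, len(shuffled)


partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5), ("b", 6)],
]
grouped_totals, grouped_shuffled = shuffle_group_by_key(partitions)
reduced_totals, reduced_shuffled = shuffle_reduce_by_key(partitions, lambda x, y: x + y)
print(grouped_totals == reduced_totals)    # True
print(grouped_shuffled, reduced_shuffled)  # 6 4
```

Both approaches give the same sums, but the pre-combined version moves one record per key per partition instead of every record.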
RDD is a low-level abstraction representing a distributed collection of objects, while DataFrame is a higher-level abstraction representing a distributed collection of data organized into named columns.
RDD is more suitable for unstructured data and low-level transformations, while DataFrame is more suitable for structured data and high-level abstractions.
DataFrames provide optimizations like query optimization and code...
The different modes of execution in Apache Spark include local mode, standalone mode, YARN mode, and Mesos mode.
Local mode: Spark runs on a single machine, with the driver and executor in one JVM and parallelism coming from threads (for example, local[4]).
Standalone mode: Spark runs on a cluster managed by a standalone cluster manager.
YARN mode: Spark runs on a Hadoop cluster using YARN as the resource manager.
Mesos mode: Spark runs on a Mesos cluster with Mesos as the resource manager.
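Each of these modes is selected through the master URL passed to Spark. The helper below is a hypothetical convenience function (not part of Spark) that builds the URL string for each mode; the host name in the usage lines is a placeholder, and the default ports are assumed to be the usual 7077 (standalone) and 5050 (Mesos).

```python
def master_url(mode, host=None, port=None, threads="*"):
    """Return the Spark master URL string for an execution mode."""
    if mode == "local":
        # local[*] uses as many worker threads as there are cores.
        return f"local[{threads}]"
    if mode == "yarn":
        # The cluster location comes from the Hadoop configuration, not the URL.
        return "yarn"
    defaults = {"standalone": ("spark", 7077), "mesos": ("mesos", 5050)}
    if mode not in defaults:
        raise ValueError(f"unknown mode: {mode}")
    if host is None:
        raise ValueError(f"{mode} mode needs a master host")
    scheme, default_port = defaults[mode]
    return f"{scheme}://{host}:{port or default_port}"


print(master_url("local"))                                   # local[*]
print(master_url("standalone", "spark-master.example.com"))  # spark://spark-master.example.com:7077
print(master_url("yarn"))                                    # yarn
print(master_url("mesos", "mesos-master.example.com"))       # mesos://mesos-master.example.com:5050
```

The resulting string is what would typically be supplied as the `--master` option of `spark-submit`.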
I applied via Recruitment Consultant and was interviewed in Nov 2024. There were 2 interview rounds.
Different types of joins available in Databricks include inner join, outer join, left join, right join, and cross join.
Inner join: Returns only the rows that have matching values in both tables.
Outer join (full outer): Returns all rows from both tables, with nulls where a row has no match in the other table.
Left join: Returns all rows from the left table and the matched rows from the right table.
Right join: Returns all rows from the right table and the matched rows ...
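The semantics of these join types can be checked with a small pure-Python sketch over lists of dicts. It is an illustration only, not Databricks or Spark code: the `join` function and the sample data are hypothetical, and it assumes every row on one side has the same columns.

```python
from itertools import product


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. `how`: inner, left, right, or outer."""
    if how not in {"inner", "left", "right", "outer"}:
        raise ValueError(f"unsupported join type: {how}")
    left_cols = list(left[0]) if left else []
    right_cols = list(right[0]) if right else []
    rows, matched_right = [], set()
    for lrow in left:
        matches = [i for i, rrow in enumerate(right) if rrow[key] == lrow[key]]
        matched_right.update(matches)
        for i in matches:
            rows.append({**lrow, **right[i]})
        if not matches and how in {"left", "outer"}:
            # Keep the unmatched left row, padding right-hand columns with None.
            rows.append({**lrow, **{c: None for c in right_cols if c != key}})
    if how in {"right", "outer"}:
        for i, rrow in enumerate(right):
            if i not in matched_right:
                # Keep the unmatched right row, padding left-hand columns with None.
                rows.append({**{c: None for c in left_cols if c != key}, **rrow})
    return rows


employees = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}, {"id": 3, "name": "Meera"}]
cities = [{"id": 2, "city": "Pune"}, {"id": 3, "city": "Chennai"}, {"id": 4, "city": "Delhi"}]

for how in ("inner", "left", "right", "outer"):
    print(how, len(join(employees, cities, "id", how)))  # 2, 3, 3, 4 rows respectively

# A cross join pairs every left row with every right row, with no key involved.
print(len(list(product(employees, cities))))  # 9
```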
Implementing fault tolerance in a data pipeline involves redundancy, monitoring, and error handling.
Use redundant components to ensure continuous data flow
Implement monitoring tools to detect failures and bottlenecks
Set up automated alerts for immediate response to issues
Design error handling mechanisms to gracefully handle failures
Use checkpoints and retries to ensure data integrity
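Checkpoints and retries, the last point above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the checkpoint is an in-memory set (a real pipeline would persist it), and every exception is treated as retryable.

```python
import time


def run_with_retries(step, max_attempts=3, base_delay=0.01):
    """Call `step`, retrying with exponential backoff if it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the error
            time.sleep(base_delay * 2 ** (attempt - 1))


def process_batches(batches, handler, checkpoint):
    """Process batches in order, skipping any batch id already in `checkpoint`."""
    for batch_id, batch in enumerate(batches):
        if batch_id in checkpoint:
            continue  # finished in an earlier run
        run_with_retries(lambda: handler(batch))
        checkpoint.add(batch_id)  # record progress only after success


calls = []
failed_once = set()


def flaky_handler(batch):
    """Fails the first time it sees batch [3, 4], then succeeds."""
    key = tuple(batch)
    if key == (3, 4) and key not in failed_once:
        failed_once.add(key)
        raise RuntimeError("transient failure")
    calls.append(sum(batch))


checkpoint = {0}  # pretend batch 0 finished in an earlier run
process_batches([[1, 2], [3, 4], [5, 6]], flaky_handler, checkpoint)
print(calls)               # [7, 11]
print(sorted(checkpoint))  # [0, 1, 2]
```

Recording the checkpoint only after the handler succeeds means a crash mid-batch causes that batch to be reprocessed, so handlers should be idempotent.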
Auto Loader is a Databricks feature that incrementally ingests new data files as they arrive in cloud storage.
It is exposed as the cloudFiles Structured Streaming source and tracks which files have already been processed.
It reduces manual effort and human error in file ingestion.
It can run continuously or be triggered as a scheduled batch job.
Other tools that cover file ingestion outside Databricks include Apache NiFi and AWS Glue.
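Assuming the question is about Databricks Auto Loader, its core idea (pick up only files that have not been processed yet) can be modelled in plain Python. This is a conceptual sketch, not the Databricks API: `ingest_new_files`, the `seen` set, and the file names are all hypothetical.

```python
import os
import tempfile


def ingest_new_files(source_dir, seen, sink):
    """Append lines from files not yet in `seen` to `sink`; return the new file names."""
    new_files = sorted(f for f in os.listdir(source_dir) if f not in seen)
    for name in new_files:
        with open(os.path.join(source_dir, name)) as fh:
            sink.extend(line.strip() for line in fh if line.strip())
        seen.add(name)  # remember the file so it is never loaded twice
    return new_files


with tempfile.TemporaryDirectory() as landing:
    seen, table = set(), []
    with open(os.path.join(landing, "part-1.csv"), "w") as fh:
        fh.write("1,a\n2,b\n")
    first = ingest_new_files(landing, seen, table)
    with open(os.path.join(landing, "part-2.csv"), "w") as fh:
        fh.write("3,c\n")
    second = ingest_new_files(landing, seen, table)

print(first, second)  # ['part-1.csv'] ['part-2.csv']
print(table)          # ['1,a', '2,b', '3,c']
```

The second call loads only `part-2.csv`, which is the incremental behaviour that distinguishes this style of ingestion from re-reading the whole directory.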
To connect to different services in Azure, you can use Azure SDKs, REST APIs, Azure Portal, Azure CLI, and Azure PowerShell.
Use Azure SDKs for programming languages like Python, Java, C#, etc.
Utilize REST APIs to interact with Azure services programmatically.
Access and manage services through the Azure Portal.
Leverage Azure CLI for command-line interface interactions.
Automate tasks using Azure PowerShell scripts.
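As an illustration of the REST API option, the sketch below builds (but does not send) a request to list resource groups through Azure Resource Manager, using only the Python standard library. The function name, the all-zero subscription id, and the `<token>` placeholder are hypothetical; the `api-version` value is an assumption that may need updating, and acquiring a real access token is out of scope here.

```python
import urllib.request


def list_resource_groups_request(subscription_id, access_token):
    """Build a GET request for the resource groups of a subscription."""
    url = (
        "https://management.azure.com/subscriptions/"
        f"{subscription_id}/resourcegroups?api-version=2021-04-01"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


req = list_resource_groups_request("00000000-0000-0000-0000-000000000000", "<token>")
print(req.get_method())  # GET
print(req.full_url)
```

In practice the Azure SDKs wrap this pattern and handle authentication, retries, and paging, which is usually why they are preferred over hand-built requests.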
Linked Services are connections to external data sources or destinations in Azure Data Factory.
Linked Services define the connection information needed to connect to external data sources or destinations.
They can be used in Data Factory pipelines to read from or write to external systems.
Examples of Linked Services include Azure Blob Storage, Azure SQL Database, and Amazon S3.
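A linked service is defined as a JSON document. The sketch below shows the general shape of an Azure Blob Storage linked service as a Python dict and serialises it; the name and the connection-string placeholder are hypothetical, and real definitions usually reference secrets rather than embedding them.

```python
import json

# Shape of a linked service definition: a name plus type-specific connection properties.
linked_service = {
    "name": "ExampleBlobStorageLinkedService",  # hypothetical name
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": "<storage-account-connection-string>"  # placeholder
        },
    },
}

print(json.dumps(linked_service, indent=2))
```

Datasets and pipeline activities then refer to the linked service by name, so connection details live in one place.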
ADF questions are questions about Azure Data Factory, covering data integration and data transformation processes.
Azure Data Factory is a cloud-based data integration service.
These questions may involve data pipelines, data flows, activities, triggers, and data movement.
Candidates may be asked about their experience with designing, monitoring, and managing data pipelines in ADF.
Exam...
I applied via LinkedIn and was interviewed in Nov 2024. There was 1 interview round.
The aptitude test covered quantitative aptitude, logical reasoning, and reading comprehension.
I have strong skills in data processing, ETL, data modeling, and programming languages like Python and SQL.
Proficient in data processing and ETL techniques
Strong knowledge of data modeling and database design
Experience with programming languages like Python and SQL
Familiarity with big data technologies such as Hadoop and Spark
Yes, I am open to relocating for the right opportunity.
I am willing to relocate for the right job opportunity.
I have experience moving for previous roles.
I am flexible and adaptable to new locations.
I am excited about the possibility of exploring a new city or country.
2 interview rounds, based on 9 reviews
Software Engineer | 935 salaries | ₹2.8 L/yr - ₹11 L/yr
Senior Associate | 761 salaries | ₹1.2 L/yr - ₹7.3 L/yr
Network Engineer | 654 salaries | ₹1.8 L/yr - ₹10 L/yr
Software Developer | 615 salaries | ₹2.5 L/yr - ₹13 L/yr
Senior Software Engineer | 510 salaries | ₹6.5 L/yr - ₹25.5 L/yr