Activities in Azure Data Factory (ADF) are the building blocks of a pipeline and perform various tasks like data movement, data transformation, and data orchestration; a sketch of a pipeline definition follows this list.
Activities can be used to copy data from one location to another (Copy Activity)
Activities can be used to transform data using mapping data flows (Data Flow Activity)
Activities can be used to run custom code or scripts (Custom Activity)
Activities can...
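For illustration, here is a minimal sketch of how a pipeline declares its activities. ADF authors this as JSON; the Python dict below mirrors that shape, and all pipeline, dataset, and data flow names are hypothetical.

```python
# Minimal sketch of an ADF pipeline definition (normally authored as JSON in
# the ADF UI or an ARM template). All names and references are hypothetical.
pipeline = {
    "name": "DailyLoadPipeline",
    "properties": {
        "activities": [
            {
                # Copy Activity: moves data from one store to another
                "name": "CopyRawToStaging",
                "type": "Copy",
                "inputs": [{"referenceName": "RawBlobDataset", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "StagingSqlDataset", "type": "DatasetReference"}],
            },
            {
                # Data Flow Activity: runs a mapping data flow after the copy succeeds
                "name": "TransformStaging",
                "type": "ExecuteDataFlow",
                "dependsOn": [{"activity": "CopyRawToStaging", "dependencyConditions": ["Succeeded"]}],
                "typeProperties": {"dataFlow": {"referenceName": "CleanStagingFlow", "type": "DataFlowReference"}},
            },
        ]
    },
}
```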
Project roles and responsibilities define team members' tasks and accountability in Azure Data Engineering projects.
Data Engineer: Designs and implements data pipelines; e.g., using Azure Data Factory for ETL processes.
Data Architect: Defines data models and architecture; e.g., creating a schema in Azure SQL Database.
Data Analyst: Analyzes data and generates insights; e.g., using Power BI for reporting.
Project Manager: ...
DataFrames in PySpark are distributed collections of data organized into named columns.
DataFrames are similar to tables in a relational database, with rows and columns.
They can be created from various data sources like CSV, JSON, Parquet, etc.
DataFrames support SQL queries and transformations using PySpark functions, as the runnable sketch below shows.
Example: df = spark.read.csv('file.csv', header=True, inferSchema=True)
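A small self-contained expansion of that example follows; the file name and the 'amount' column are assumptions made for illustration.

```python
# Self-contained PySpark sketch; assumes PySpark is installed and 'file.csv'
# exists with a header row containing a numeric 'amount' column (hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataframe-demo").getOrCreate()

# Read a CSV into a DataFrame, inferring column types from the data.
df = spark.read.csv("file.csv", header=True, inferSchema=True)

# Column-level transformation with a PySpark function.
df2 = df.withColumn("amount_with_tax", F.col("amount") * 1.18)

# Register a temp view so the same data can be queried with SQL.
df2.createOrReplaceTempView("sales")
spark.sql("SELECT COUNT(*) AS row_count FROM sales").show()
```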
I appeared for an interview in Oct 2024.
Most of the questions were resume-based and scenario-based, along with basic SQL and PySpark questions.
I applied via Naukri.com and was interviewed in Mar 2024. There was 1 interview round.
Azure Data Factory is a cloud-based data integration service that allows you to create, schedule, and manage data pipelines.
Azure Data Factory is used to move and transform data from various sources to destinations.
It supports data integration and orchestration of workflows.
You can monitor and manage data pipelines using Azure Data Factory.
It provides a visual interface for designing and monitoring data pipelines.
Azure...
Azure Data Lake is a scalable data storage and analytics service provided by Microsoft Azure.
Azure Data Lake Store is a secure data repository that allows you to store and analyze petabytes of data.
Azure Data Lake Analytics is an on-demand, distributed analytics service that runs U-SQL jobs over data in the lake; Spark and Hadoop engines (for example Azure Databricks or HDInsight) can also process that data.
It is designed for big data processing and analytics tasks, providing high performance and scalability; a PySpark read sketch follows.
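As a concrete illustration, this hedged PySpark sketch reads Parquet files from ADLS Gen2; the storage account, container, path, and key-based authentication are all assumptions and vary by environment.

```python
# Hedged sketch: read Parquet from Azure Data Lake Storage Gen2 with PySpark.
# Account, container, path, and key are hypothetical; managed environments
# such as Databricks or Synapse usually handle authentication for you.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adls-read").getOrCreate()

# One common option: authenticate the ABFS driver with a storage account key.
spark.conf.set(
    "fs.azure.account.key.mystorageacct.dfs.core.windows.net",
    "<storage-account-key>",
)

path = "abfss://mycontainer@mystorageacct.dfs.core.windows.net/raw/events/"
df = spark.read.parquet(path)
df.printSchema()
```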
An index in a table is a data structure that improves the speed of data retrieval operations on a database table.
Indexes are used to quickly locate data without having to search every row in a table.
They can be created on one or more columns in a table.
Examples of indexes include primary keys, unique constraints, and non-unique indexes; a small runnable demonstration follows.
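To make the speedup concrete, here is a self-contained demo using Python's built-in sqlite3 module; the orders table and its rows are made up, but the CREATE INDEX statement is standard SQL that carries over to Azure SQL Database.

```python
# Index demo with Python's built-in sqlite3 module; table and data are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Non-unique index on the column we filter by.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# The query plan now shows an index search instead of a full table scan.
for row in cur.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"):
    print(row)

conn.close()
```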
I appeared for an interview in May 2024.
Copy activity is the core Azure Data Factory feature for moving data between supported data stores.
It supports a wide range of sources and destinations, such as Azure Blob Storage, Azure SQL Database, and more.
You can define data movement tasks using pipelines in Azure Data Factory and monitor the progress of copy activities; a sketch of a Copy activity definition follows.
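Roughly, a Copy activity pairs a source dataset with a sink dataset. ADF stores this as JSON; the dict below mirrors that shape with hypothetical dataset names, copying CSV from Blob Storage into Azure SQL Database.

```python
# Hedged sketch of a Copy activity definition (ADF authors this as JSON).
# Dataset names are hypothetical; source/sink types match a CSV-to-SQL copy.
copy_activity = {
    "name": "CopyBlobToAzureSql",
    "type": "Copy",
    "inputs": [{"referenceName": "BlobCsvDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SqlOrdersDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},  # read delimited text from Blob Storage
        "sink": {"type": "AzureSqlSink"},           # write rows into Azure SQL Database
    },
}
```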
I applied via Referral and was interviewed in Sep 2023. There were 4 interview rounds.
Data masking in Azure helps protect sensitive information by replacing original data with fictitious data.
Use Dynamic Data Masking in Azure SQL Database to obfuscate sensitive data in real time (a sketch follows this list)
Leverage Azure Purview to discover and classify sensitive data across various data sources
Implement Azure Data Factory to transform and mask data during ETL processes
Utilize Azure Information Protection to apply encryption...
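As one hedged example of the first point, this snippet applies a Dynamic Data Masking rule to an Azure SQL Database column from Python via pyodbc; the connection string, table, and column are hypothetical.

```python
# Sketch: apply a Dynamic Data Masking rule over pyodbc (package must be
# installed). Server, credentials, table, and column are all hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;UID=admin;PWD=<password>"
)
cur = conn.cursor()

# Keep the first and last characters of Email; mask everything in between.
cur.execute("""
    ALTER TABLE dbo.Customers
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'partial(1,"***",1)')
""")
conn.commit()
conn.close()
```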
Azure IR stands for Azure Integration Runtime, the compute infrastructure that Azure Data Factory uses to run data integration activities natively in the cloud.
Azure IR is used to provide data integration capabilities across different network environments.
It allows data movement between cloud and on-premises data sources.
Azure IR can be configured to run data integration activities in Azure Data Factory pipelines.
It supports different types of data integration activities, such as data movement between cloud stores and mapping data flow execution.
SCD Type 1 is a method of updating data in a data warehouse by overwriting existing data with new information.
Overwrites existing data with new information
No historical data is kept
Simplest and fastest method of updating data; a PySpark upsert sketch follows
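SCD Type 1 is usually implemented as an upsert. The sketch below uses Delta Lake's MERGE from PySpark; it assumes an active SparkSession named spark, the delta-spark package, and hypothetical paths and key column.

```python
# Hedged SCD Type 1 sketch: upsert via Delta Lake MERGE. Assumes delta-spark
# is installed, `spark` is an active SparkSession, and the paths and the
# customer_id key are hypothetical. Matched rows are overwritten, so no
# history is kept, which is exactly the Type 1 behaviour described above.
from delta.tables import DeltaTable

target = DeltaTable.forPath(spark, "/mnt/warehouse/dim_customer")
updates = spark.read.parquet("/mnt/staging/customer_updates")

(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()      # overwrite the existing row (no history)
    .whenNotMatchedInsertAll()   # insert rows for brand-new customers
    .execute()
)
```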
I applied via Naukri.com and was interviewed in Aug 2023. There were 2 interview rounds.
Both scheduled triggers and tumbling window triggers fire on a time schedule, but they behave differently.
A scheduled trigger fires on a wall-clock schedule, such as every hour or every day, and can start many pipelines; it fires only for times in the future.
A tumbling window trigger fires for fixed-size, non-overlapping time windows, is pinned to a single pipeline, and keeps state for each window.
Tumbling window triggers can backfill past windows (the start time may be in the past) and support retries and window dependencies; scheduled triggers do not.
Scheduled triggers suit regular jobs like nightly ETL runs.
Tumbling window triggers suit aggregating or incrementally loading data over fixed time intervals; a trigger definition sketch follows.
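For reference, here is a hedged sketch of a tumbling window trigger definition; ADF stores triggers as JSON, shown here as a Python dict with hypothetical names and times.

```python
# Hedged sketch of a tumbling window trigger (ADF stores this as JSON).
# Pipeline name, start time, and limits are illustrative.
tumbling_trigger = {
    "name": "HourlyWindowTrigger",
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Hour",                   # fixed-size windows...
            "interval": 1,                         # ...one hour wide
            "startTime": "2024-01-01T00:00:00Z",   # a past start time enables backfill
            "maxConcurrency": 4,
            "retryPolicy": {"count": 2, "intervalInSeconds": 300},
        },
        "pipeline": {
            "pipelineReference": {"referenceName": "HourlyAggPipeline", "type": "PipelineReference"},
            # Window boundaries are handed to the pipeline as parameters.
            "parameters": {
                "windowStart": "@trigger().outputs.windowStartTime",
                "windowEnd": "@trigger().outputs.windowEndTime",
            },
        },
    },
}
```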
Control flow activities in Azure Data Factory (ADF) are used to define the workflow and execution order of activities.
Control flow activities are used to manage the flow of data and control the execution order of activities in ADF.
They allow you to define dependencies between activities and specify conditions for their execution.
Some commonly used control flow activities in ADF are If Condition, ForEach, Until, and Switch; a ForEach sketch follows.
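A minimal ForEach sketch in the same JSON-as-dict style; the pipeline parameter and activity names are made up.

```python
# Hedged sketch of a ForEach control flow activity that runs a Copy activity
# once per item; the tableList parameter and all names are hypothetical.
foreach_activity = {
    "name": "ForEachTable",
    "type": "ForEach",
    "typeProperties": {
        # Expression evaluated at run time against a pipeline parameter.
        "items": {"value": "@pipeline().parameters.tableList", "type": "Expression"},
        "activities": [
            {"name": "CopyOneTable", "type": "Copy"}  # inner activity, run per item
        ],
    },
}
```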
Linked services in ADF are connections to external data sources or destinations that allow data movement and transformation.
Linked services are used to connect to various data sources such as databases, file systems, and cloud services.
They provide the necessary information and credentials to establish a connection.
Linked services enable data movement activities like copying data from one source to another or transforming data; a JSON-shaped sketch follows.
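A hedged sketch of a linked service definition, again as a Python dict mirroring the JSON that ADF stores; the name and connection string are placeholders, not real credentials.

```python
# Hedged sketch of an Azure Blob Storage linked service (ADF stores this as
# JSON). The connection string is a placeholder; real secrets usually come
# from Azure Key Vault rather than being inlined.
linked_service = {
    "name": "BlobStorageLS",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>"
        },
    },
}
```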
IR stands for Integration Runtime. There are three types of IR: Azure, Self-hosted, and Azure-SSIS.
Azure IR runs data movement and data flows natively between cloud data stores.
Self-hosted IR is used to connect to on-premises data sources.
Azure-SSIS IR is used to run SSIS packages in Azure Data Factory.
Self-hosted IR requires an on-premises machine or VM where it is installed and configured.
Azure-SSIS IR is a fully managed service provided by Azure.
All of these IR types enable data movement and transformation in Azure Data Factory.
We should use the Self-hosted Integration Runtime (IR) to copy data from an on-premises database to Azure.
Self-hosted IR allows data movement between on-premises systems and Azure
It is installed on a local machine or virtual machine inside the on-premises network
Self-hosted IR securely connects to the on-premises data source and transfers data to Azure
It supports various data sources like SQL Server, Oracle, MySQL, etc.
Self-hosted IR can be managed and monitored from the Azure Data Factory portal; a linked service sketch using it follows
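Tying the two answers together, this hedged sketch shows an on-premises SQL Server linked service routed through a self-hosted IR via connectVia; the IR, server, and database names are hypothetical.

```python
# Hedged sketch: an on-premises SQL Server linked service that reaches the
# database through a self-hosted IR ("connectVia"). Names are hypothetical.
onprem_sql_ls = {
    "name": "OnPremSqlServerLS",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "Server=onprem-sql01;Database=Sales;Integrated Security=True"
        },
        "connectVia": {
            "referenceName": "SelfHostedIR-01",
            "type": "IntegrationRuntimeReference",
        },
    },
}
```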
I applied via Referral and was interviewed in Nov 2023. There was 1 interview round.
How record counts differ across join types, with worked examples.
Inner Join: Returns records with matching values in both datasets. Example: 5 records in A and 5 in B with 3 matches = 3 records.
Left Join: Returns all records from the left dataset and matched records from the right. Example: 5 in A, 5 in B, 3 matches = 5 records.
Right Join: Returns all records from the right dataset and matched records from the left. Example: 5 in A, 5 in B, 3 matches = 5 records. A runnable sketch follows.
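The PySpark sketch below reproduces these counts with made-up keys (3, 4, and 5 shared between A and B); a full outer join is included for completeness.

```python
# Reproduce the join counts above in PySpark: 5 rows in A, 5 in B, 3 matches.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join-counts").getOrCreate()

a = spark.createDataFrame([(i,) for i in [1, 2, 3, 4, 5]], ["id"])
b = spark.createDataFrame([(i,) for i in [3, 4, 5, 6, 7]], ["id"])

print(a.join(b, "id", "inner").count())  # 3: only the matching keys
print(a.join(b, "id", "left").count())   # 5: all of A, plus matches from B
print(a.join(b, "id", "right").count())  # 5: all of B, plus matches from A
print(a.join(b, "id", "outer").count())  # 7: union of keys from both sides
```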
I applied via Company Website and was interviewed before Feb 2020. There was 1 interview round.
I applied via Job Portal and was interviewed before Dec 2019. There was 1 interview round.
Role | Salaries reported | Salary range
Application Development Analyst | 39.3k | ₹4.8 L/yr - ₹11 L/yr
Application Development - Senior Analyst | 27.7k | ₹8.3 L/yr - ₹16.1 L/yr
Team Lead | 26.7k | ₹12.6 L/yr - ₹22.5 L/yr
Senior Analyst | 19.6k | ₹9 L/yr - ₹15.7 L/yr
Senior Software Engineer | 18.5k | ₹10.4 L/yr - ₹18 L/yr
TCS
Cognizant
Capgemini
Infosys