Deloitte
AGIO Pharmaceuticals Interview Questions and Answers
Q1. Explain copy activity in ADF, slowly changing dimensions, and data warehousing
Copy activity in ADF is used to move data from source to destination.
Copy activity supports various sources and destinations such as Azure Blob Storage, Azure SQL Database, etc.
It can be used for both one-time and scheduled data movement.
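It supports schema mapping, which maps source columns to destination columns; mapping data flows are a separate ADF activity used for transformations.
Copy activity by itself only moves data; slowly changing dimensions are typically implemented with mapping data flows, or with a stored procedure or MERGE statement run after the copy.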
Copy activity is commonly used in data warehousing scenarios.
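To make the slowly changing dimension point concrete, here is a minimal sketch of Type 2 history tracking in plain Python, independent of ADF. The function name `apply_scd2`, the column names (`valid_from`, `valid_to`, `is_current`), and the sample rows are all hypothetical and chosen only for illustration; a real pipeline would do the equivalent in a data flow or a SQL MERGE.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with Type 2 history applied.

    dimension: existing rows (dicts) with valid_from, valid_to, is_current.
    incoming:  latest source rows (dicts) without the history columns.
    """
    result = [dict(row) for row in dimension]  # copy so the input is not modified
    current = {row[key]: row for row in result if row["is_current"]}

    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # tracked attributes unchanged: nothing to do
        if old is not None:
            # Close the old version instead of overwriting it.
            old["valid_to"] = today
            old["is_current"] = False
        # Insert the new version (or a brand-new key) as the current row.
        result.append(
            {**new, "valid_from": today, "valid_to": None, "is_current": True}
        )
    return result


dim = [
    {"customer_id": 1, "city": "Pune", "valid_from": date(2023, 1, 1),
     "valid_to": None, "is_current": True},
]
updates = [
    {"customer_id": 1, "city": "Mumbai"},  # changed attribute
    {"customer_id": 2, "city": "Delhi"},   # new key
]

new_dim = apply_scd2(dim, updates, key="customer_id", tracked=["city"],
                     today=date(2024, 6, 1))
for row in new_dim:
    print(row)
```

The old row for customer 1 is kept and closed rather than updated in place, which is what distinguishes Type 2 from a plain overwrite (Type 1).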
Q2. What are the differences between Data Lake Gen1 and Gen2?
Data Lake Gen1 is a standalone service modeled on the Hadoop Distributed File System (HDFS), while Gen2 is built on Azure Blob Storage.
Both Gen1 and Gen2 expose a hierarchical file system; Gen2 adds a hierarchical namespace on top of Blob Storage, which by itself has a flat namespace.
Gen2 provides better performance, scalability, and security compared to Gen1.
Gen2 inherits Blob Storage features such as access tiers and lifecycle management, and supports POSIX-style access control lists (ACLs), which Gen1 also supports.
Gen2 allows direct access to data ...
Q3. SQL query and difference between RANK, DENSE_RANK, and ROW_NUMBER
RANK, DENSE_RANK, and ROW_NUMBER are SQL window functions that number the rows of a result set according to the ORDER BY in their OVER clause.
RANK gives tied rows the same rank and leaves gaps after ties (for example 1, 2, 2, 4).
DENSE_RANK also gives tied rows the same rank, but it does not leave gaps in the ranking sequence (for example 1, 2, 2, 3).
ROW_NUMBER assigns a unique sequential number to every row, so tied rows get different numbers, in an arbitrary order unless the ORDER BY breaks the tie.
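The difference between the three functions shows up only when rows tie on the ordering column. The sketch below runs them side by side using Python's built-in `sqlite3` module; it assumes the bundled SQLite is version 3.25 or newer (window function support), and the `emp` table and its rows are hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: Bilal and Chen tie on salary.
rows = [("Asha", 100), ("Bilal", 90), ("Chen", 90), ("Dana", 80)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)

query = """
SELECT name,
       salary,
       RANK()       OVER (ORDER BY salary DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num
FROM emp
ORDER BY salary DESC, name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)

# ('Asha', 100, 1, 1, 1)
# ('Bilal', 90, 2, 2, 2)
# ('Chen', 90, 2, 2, 3)
# ('Dana', 80, 4, 3, 4)
```

The tied rows share rank 2 under both RANK and DENSE_RANK; the next row is 4 under RANK (gap) and 3 under DENSE_RANK (no gap). ROW_NUMBER here uses `name` as a tie-breaker so its output is deterministic.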
Q4. Difference between OLAP and OLTP
OLAP is for analytics and reporting while OLTP is for transaction processing.
OLAP stands for Online Analytical Processing
OLTP stands for Online Transaction Processing
OLAP is used for complex queries and data analysis
OLTP is used for real-time transaction processing
OLAP databases are read-intensive while OLTP databases are write-intensive
Examples of OLAP databases include data warehouses and data marts
Examples of OLTP systems include the databases behind banking systems and e-commerce websites
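The contrast in workload shape can be illustrated on a single toy table. This is only a sketch using Python's built-in `sqlite3` module with a hypothetical `orders` table; in practice OLTP and OLAP workloads usually run on separate systems designed for each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# OLTP-style work: short transactions that insert or update individual rows.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("north", 120.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("south", 80.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("north", 50.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (90.0, 2))

# OLAP-style work: a read-only query that scans and aggregates many rows.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('north', 2, 170.0), ('south', 1, 90.0)]
```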
Q5. Difference between DataFrame and RDD
A DataFrame is a distributed collection of data organized into named columns, while an RDD is a distributed collection of arbitrary objects with no column structure; both are split into partitions.
Both DataFrames and RDDs are immutable; a transformation returns a new dataset
A DataFrame has a schema while an RDD does not
DataFrames are suited to structured and semi-structured data, while RDDs are suited to unstructured data and low-level control
DataFrames usually perform better than RDDs because their queries go through Spark's Catalyst optimizer and Tungsten execution engine
DataFrames support SQL queries, while an RDD must first be converted to a DataFrame to be queried with SQL
Q6. What is PolyBase?
PolyBase is a feature of Azure SQL Data Warehouse (now the dedicated SQL pool in Azure Synapse Analytics) and of SQL Server that allows users to query data stored in Hadoop or Azure Blob Storage using T-SQL.
PolyBase enables users to access and query external data sources without first moving the data into the database.
It provides a virtualization layer, based on external tables, that lets SQL queries combine database tables with data stored in Hadoop or Azure Blob Storage.
PolyBase can significantly improve query performance by leveraging the parallel processing capabilities of Hadoop or Azu...