Filter interviews by
I applied via Walk-in and was interviewed before Jan 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Nov 2024. There were 2 interview rounds.
I applied via Recruitment Consulltant and was interviewed in Aug 2024. There were 3 interview rounds.
The output after inner join of table 1 and table 2 will be 2,3,5.
Inner join only includes rows that have matching values in both tables.
Values 2, 3, and 5 are present in both tables, so they will be included in the output.
Null values are not considered as matching values in inner join.
The project involves building a data pipeline to ingest, process, and analyze large volumes of data from various sources in Azure.
Utilizing Azure Data Factory for data ingestion and orchestration
Implementing Azure Databricks for data processing and transformation
Storing processed data in Azure Data Lake Storage
Using Azure Synapse Analytics for data warehousing and analytics
Leveraging Azure DevOps for CI/CD pipeline aut
Designing an effective ADF pipeline involves considering various metrics and factors.
Understand the data sources and destinations
Identify the dependencies between activities
Optimize data movement and processing for performance
Monitor and track pipeline execution for troubleshooting
Consider security and compliance requirements
Use parameterization and dynamic content for flexibility
Implement error handling and retries fo
I applied via Company Website and was interviewed in Dec 2024. There was 1 interview round.
I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.
Activities in Azure Data Factory (ADF) are the building blocks of a pipeline and perform various tasks like data movement, data transformation, and data orchestration.
Activities can be used to copy data from one location to another (Copy Activity)
Activities can be used to transform data using mapping data flows (Data Flow Activity)
Activities can be used to run custom code or scripts (Custom Activity)
Activities can be u...
Dataframes in pyspark are distributed collections of data organized into named columns.
Dataframes are similar to tables in a relational database, with rows and columns.
They can be created from various data sources like CSV, JSON, Parquet, etc.
Dataframes support SQL queries and transformations using PySpark functions.
Example: df = spark.read.csv('file.csv')
I applied via Naukri.com
I applied via Recruitment Consulltant and was interviewed in Mar 2024. There was 1 interview round.
I connect onPrem to Azure using Azure ExpressRoute or VPN Gateway.
Use Azure ExpressRoute for private connection through a dedicated connection.
Set up a VPN Gateway for secure connection over the internet.
Ensure proper network configurations and security settings.
Use Azure Virtual Network Gateway to establish the connection.
Consider using Azure Site-to-Site VPN for connecting onPremises network to Azure Virtual Network.
Autoloader in Databricks is a feature that automatically loads new data files as they arrive in a specified directory.
Autoloader monitors a specified directory for new data files and loads them into a Databricks table.
It supports various file formats such as CSV, JSON, Parquet, Avro, and ORC.
Autoloader simplifies the process of ingesting streaming data into Databricks without the need for manual intervention.
It can be ...
Json data normalization involves structuring data to eliminate redundancy and improve efficiency.
Identify repeating groups of data
Create separate tables for each group
Establish relationships between tables using foreign keys
Eliminate redundant data by referencing shared values
I applied via Referral and was interviewed in May 2024. There was 1 interview round.
Polybase is a feature in Azure SQL Data Warehouse that allows users to query data stored in Hadoop or Azure Blob Storage.
Polybase enables users to access and query external data sources without moving the data into the database.
It provides a virtualization layer that allows SQL queries to seamlessly integrate with data stored in Hadoop or Azure Blob Storage.
Polybase can significantly improve query performance by levera...
Use DISTINCT keyword in SQL to remove duplicates from a dataset.
Use SELECT DISTINCT column_name FROM table_name to retrieve unique values from a specific column.
Use SELECT DISTINCT * FROM table_name to retrieve unique rows from the entire table.
Use GROUP BY clause with COUNT() function to remove duplicates based on specific criteria.
based on 2 reviews
Rating in categories
TCS
Accenture
Wipro
Cognizant