Everestdx Azure Data Engineer Interview Questions and Answers

Updated 22 Mar 2024

Everestdx Azure Data Engineer Interview Experiences

1 interview found

Interview experience: 4 (Good)
Difficulty level: Easy
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via LinkedIn and was interviewed in Feb 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. How do you optimize pyspark jobs?
  • Ans. 

    Optimizing PySpark jobs involves tuning configurations, partitioning data, caching, and using efficient transformations (a sketch follows this round's questions).

    • Tune configurations such as executor memory, number of executors, and parallelism to optimize performance.

    • Partition data properly to distribute workload evenly and avoid shuffling.

    • Cache intermediate results to avoid recomputation.

    • Use efficient transformations like map, filter, and reduceByKey instead ...

  • Answered by AI
  • Q2. How do you write stored procedures in databricks?
  • Ans. 

    Procedure-style logic in Databricks is typically written in SQL or Python notebook cells (a sketch follows this round's questions).

    • Use %sql magic command to write SQL stored procedures

    • Use %python magic command to write Python stored procedures

    • Stored procedures can be saved and executed in Databricks notebooks

  • Answered by AI
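
A minimal sketch of the tuning levers mentioned in the Q1 answer, assuming a generic SparkSession; the paths, column names, and configuration values below are illustrative, not recommendations:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Illustrative session-level tuning; the right values depend on cluster size and data volume.
    spark = (
        SparkSession.builder
        .appName("optimization-sketch")
        .config("spark.sql.shuffle.partitions", "200")   # align shuffle parallelism with the data
        .config("spark.executor.memory", "8g")           # example executor sizing
        .getOrCreate()
    )

    orders = spark.read.parquet("/data/orders")          # hypothetical input path

    # Repartition on the key used downstream to spread work evenly and limit shuffling.
    orders = orders.repartition(200, "customer_id")

    # Cache an intermediate result that is reused by more than one downstream action.
    completed = orders.filter(F.col("status") == "COMPLETED").cache()

    # Prefer built-in column expressions and aggregations over row-by-row Python UDFs.
    daily_totals = (
        completed.groupBy("customer_id", "order_date")
        .agg(F.sum("amount").alias("total_amount"))
    )

    daily_totals.write.mode("overwrite").parquet("/data/daily_totals")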
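
For Q2, a hedged sketch of the usual substitute for stored procedures in a Databricks notebook: reusable, parameterized logic wrapped in a Python function that executes SQL through spark.sql. The table names and the run_date parameter are assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("procedure-style-sketch").getOrCreate()

    def load_daily_sales(run_date: str) -> None:
        """Procedure-style routine: parameterized SQL executed from Python."""
        # Illustrative table names; in a notebook the same SQL could sit in a %sql cell.
        spark.sql(f"""
            INSERT INTO reporting.daily_sales
            SELECT order_date, SUM(amount) AS total_amount
            FROM raw.orders
            WHERE order_date = '{run_date}'
            GROUP BY order_date
        """)

    # Invoked like a stored procedure call, e.g. from a job or another notebook.
    load_daily_sales("2024-02-01")

In practice such a function would be shared across notebooks via a repo module or a %run include rather than copied into each notebook.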

Interview questions from similar companies

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Selected

I applied via a recruitment consultant and was interviewed in Aug 2024. There were 3 interview rounds.

Round 1 - Technical 

(4 Questions)

  • Q1. Let's say you have table 1 with the values 1, 2, 3, 5, null, null, 0 and table 2 has null, 2, 4, 7, 3, 5. What would be the output after an inner join?
  • Ans. 

    The output after the inner join of table 1 and table 2 will be 2, 3, 5 (a quick reproduction is sketched after this round's questions).

    • Inner join only includes rows that have matching values in both tables.

    • Values 2, 3, and 5 are present in both tables, so they will be included in the output.

    • Null values are not considered as matching values in inner join.

  • Answered by AI
  • Q2. Let's say you have a Customers table with CustomerID and customer name, and an Orders table with OrderID and CustomerID. Write a query to find the customer name who placed the maximum orders. If more than one person... (one approach is sketched after this round's questions)
  • Q3. Spark Architecture, Optimisation techniques
  • Q4. Some personal questions.
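
A quick reproduction of the Q1 behaviour: NULL never satisfies an equality join condition, so only the matching values 2, 3 and 5 survive the inner join. The column name is illustrative:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("null-inner-join-demo").getOrCreate()

    t1 = spark.createDataFrame([(v,) for v in [1, 2, 3, 5, None, None, 0]], ["val"])
    t2 = spark.createDataFrame([(v,) for v in [None, 2, 4, 7, 3, 5]], ["val"])

    # NULL = NULL evaluates to NULL (not TRUE), so the null rows are dropped by the inner join.
    t1.join(t2, on="val", how="inner").show()
    # Expected rows: 2, 3, 5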
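
For Q2, one hedged approach; the table and column names (Customers, Orders, CustomerID, CustomerName, OrderID) are assumptions based on the question, and DENSE_RANK keeps every customer tied on the maximum order count:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("max-orders-sketch").getOrCreate()

    # Assumes Customers(CustomerID, CustomerName) and Orders(OrderID, CustomerID) are registered tables.
    result = spark.sql("""
        WITH order_counts AS (
            SELECT c.CustomerName,
                   COUNT(o.OrderID) AS order_count,
                   DENSE_RANK() OVER (ORDER BY COUNT(o.OrderID) DESC) AS rnk
            FROM Customers c
            JOIN Orders o ON o.CustomerID = c.CustomerID
            GROUP BY c.CustomerID, c.CustomerName
        )
        SELECT CustomerName, order_count
        FROM order_counts
        WHERE rnk = 1
    """)
    result.show()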
Round 2 - Technical 

(5 Questions)

  • Q1. Explain the entire architecture of a recent project you are working on in your organisation.
  • Ans. 

    The project involves building a data pipeline to ingest, process, and analyze large volumes of data from various sources in Azure.

    • Utilizing Azure Data Factory for data ingestion and orchestration

    • Implementing Azure Databricks for data processing and transformation

    • Storing processed data in Azure Data Lake Storage

    • Using Azure Synapse Analytics for data warehousing and analytics

    • Leveraging Azure DevOps for CI/CD pipeline automation

  • Answered by AI
  • Q2. How do you design an effective ADF pipeline and what all metrics and considerations you should keep in mind while designing?
  • Ans. 

    Designing an effective ADF pipeline involves considering various metrics and factors.

    • Understand the data sources and destinations

    • Identify the dependencies between activities

    • Optimize data movement and processing for performance

    • Monitor and track pipeline execution for troubleshooting

    • Consider security and compliance requirements

    • Use parameterization and dynamic content for flexibility

    • Implement error handling and retries for ...

  • Answered by AI
  • Q3. Let's say you have a very huge data volume; in terms of performance, how would you slice and dice the data in such a way that you can boost performance?
  • Q4. Let's say you have to reconstruct a table and have to preserve the historical data? (I couldn't answer this; refer to SCD, slowly changing dimensions. A sketch follows this round's questions.)
  • Q5. We have both ADF and Databricks. I can achieve transformation, fetching the data, and loading the dimension layer using ADF as well, so why do we use Databricks if both have similar functionality for a few ...
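
For Q4, history is usually preserved with a slowly changing dimension, most often SCD Type 2: the existing current row is closed out and a new current version is inserted. A hedged sketch assuming Delta tables named dim_customer and staging_customer with is_current / valid_from / valid_to columns:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

    # Step 1: close out current rows whose tracked attribute changed.
    spark.sql("""
        MERGE INTO dim_customer AS d
        USING staging_customer AS s
        ON d.customer_id = s.customer_id AND d.is_current = true
        WHEN MATCHED AND d.address <> s.address THEN
          UPDATE SET is_current = false, valid_to = current_date()
    """)

    # Step 2: insert a new current version for changed and brand-new customers
    # (anything in staging without an open current row after step 1).
    spark.sql("""
        INSERT INTO dim_customer
        SELECT s.customer_id, s.address,
               true AS is_current, current_date() AS valid_from, CAST(NULL AS DATE) AS valid_to
        FROM staging_customer s
        LEFT JOIN dim_customer d
          ON d.customer_id = s.customer_id AND d.is_current = true
        WHERE d.customer_id IS NULL
    """)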
Round 3 - HR 

(1 Question)

  • Q1. Basic HR questions

Interview Preparation Tips

Topics to prepare for Tech Mahindra Azure Data Engineer interview:
  • SQL
  • Databricks
  • Azure Data Factory
  • Pyspark
  • Spark
Interview preparation tips for other job seekers - The interviewers were really nice.

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Selected

I applied via Naukri.com and was interviewed in Oct 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. RDD, ADF Integration Runtime, Databricks, Spark architecture
  • Q2. Agile and project-related questions
Interview experience: 5 (Excellent)
Round 1 - Technical 

(2 Questions)

  • Q1. Activities used in ADF
  • Ans. 

    Activities in Azure Data Factory (ADF) are the building blocks of a pipeline and perform various tasks like data movement, data transformation, and data orchestration.

    • Activities can be used to copy data from one location to another (Copy Activity)

    • Activities can be used to transform data using mapping data flows (Data Flow Activity)

    • Activities can be used to run custom code or scripts (Custom Activity)

    • Activities can be u...

  • Answered by AI
  • Q2. Dataframes in pyspark
  • Ans. 

    DataFrames in PySpark are distributed collections of data organized into named columns (a fuller sketch follows this round's questions).

    • Dataframes are similar to tables in a relational database, with rows and columns.

    • They can be created from various data sources like CSV, JSON, Parquet, etc.

    • Dataframes support SQL queries and transformations using PySpark functions.

    • Example: df = spark.read.csv('file.csv')

  • Answered by AI
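
Expanding the one-line example from the Q2 answer, a self-contained sketch of creating a DataFrame and applying both DataFrame-API and SQL-style transformations; the file path and column names are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dataframe-sketch").getOrCreate()

    # Read a CSV into a DataFrame; header and schema inference are optional reader flags.
    df = spark.read.csv("file.csv", header=True, inferSchema=True)

    # DataFrame-API transformations on named columns.
    high_value = df.filter(F.col("amount") > 1000).select("customer_id", "amount")
    high_value.show()

    # The same data can be queried with SQL after registering a temporary view.
    df.createOrReplaceTempView("orders")
    spark.sql("SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id").show()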
Round 2 - HR 

(2 Questions)

  • Q1. Managerial Questions
  • Q2. About project roles and responsibilities

Interview experience: 4 (Good)
Result: Selected

I applied via Naukri.com

Round 1 - Technical 

(4 Questions)

  • Q1. Questions based on my previous company's projects
  • Q2. SQL-based questions were asked
  • Q3. ADF-based questions were asked
  • Q4. Azure-related questions were asked
Round 2 - HR 

(1 Question)

  • Q1. Regarding salary discussion

Interview Preparation Tips

Interview preparation tips for other job seekers - NA
Interview experience: 5 (Excellent)
Round 1 - Technical 

(1 Question)

  • Q1. Spark Architecture
Round 2 - Technical 

(2 Questions)

  • Q1. Explain your project
  • Q2. Remove duplicates
  • Ans. 

    Use the DISTINCT keyword in SQL to remove duplicates from a dataset (a sketch follows this round's questions).

    • Use SELECT DISTINCT column_name FROM table_name to retrieve unique values from a specific column.

    • Use SELECT DISTINCT * FROM table_name to retrieve unique rows from the entire table.

    • Use GROUP BY clause with COUNT() function to remove duplicates based on specific criteria.

  • Answered by AI
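
A short sketch of the two usual routes for the Q2 answer: DISTINCT in Spark SQL and dropDuplicates on the DataFrame API. The input path and key column are assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("dedup-sketch").getOrCreate()

    df = spark.read.parquet("/data/customers")   # hypothetical input

    # SQL route: DISTINCT over full rows, via a temporary view.
    df.createOrReplaceTempView("customers")
    unique_rows = spark.sql("SELECT DISTINCT * FROM customers")
    unique_rows.show()

    # DataFrame route: dropDuplicates, optionally restricted to the business key.
    unique_by_key = df.dropDuplicates(["customer_id"])
    unique_by_key.show()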
Round 3 - HR 

(1 Question)

  • Q1. Salary expectations

Interview experience: 3 (Average)
Difficulty level: Easy
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via a recruitment consultant and was interviewed in Mar 2024. There was 1 interview round.

Round 1 - Technical 

(4 Questions)

  • Q1. How are you connecting to your on-prem environment from Azure?
  • Ans. 

    I connect on-prem to Azure using Azure ExpressRoute or VPN Gateway.

    • Use Azure ExpressRoute for private connection through a dedicated connection.

    • Set up a VPN Gateway for secure connection over the internet.

    • Ensure proper network configurations and security settings.

    • Use Azure Virtual Network Gateway to establish the connection.

    • Consider using Azure Site-to-Site VPN for connecting the on-premises network to the Azure Virtual Network.

  • Answered by AI
  • Q2. What is Autoloader in Databricks?
  • Ans. 

    Autoloader in Databricks is a feature that automatically loads new data files as they arrive in a specified directory (a sketch follows this round's questions).

    • Autoloader monitors a specified directory for new data files and loads them into a Databricks table.

    • It supports various file formats such as CSV, JSON, Parquet, Avro, and ORC.

    • Autoloader simplifies the process of ingesting streaming data into Databricks without the need for manual intervention.

    • It can be ...

  • Answered by AI
  • Q3. How do you normalize your Json data
  • Ans. 

    JSON data normalization involves structuring the data to eliminate redundancy and improve efficiency (a flattening sketch follows this round's questions).

    • Identify repeating groups of data

    • Create separate tables for each group

    • Establish relationships between tables using foreign keys

    • Eliminate redundant data by referencing shared values

  • Answered by AI
  • Q4. How do you read from Kafka?
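
For Q2, a hedged sketch of Autoloader, i.e. the Databricks cloudFiles streaming source; the landing, schema, checkpoint and target locations are illustrative, and spark is the session predefined in a Databricks notebook:

    # Auto Loader incrementally picks up new files landing in a directory.
    # Runs only on Databricks; all paths below are illustrative.
    events = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")                     # also csv, parquet, avro, ...
        .option("cloudFiles.schemaLocation", "/mnt/_schemas/events")
        .load("/mnt/landing/events")
    )

    (
        events.writeStream
        .option("checkpointLocation", "/mnt/_checkpoints/events")
        .trigger(availableNow=True)                              # drain the backlog, then stop
        .toTable("bronze.events")
    )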
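
For Q3, in a PySpark pipeline "normalizing" JSON often means flattening nested objects and arrays into flat columns before relational modelling. A minimal sketch assuming a hypothetical nested order document:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("json-flatten-sketch").getOrCreate()

    # Assumed shape: {"order_id": ..., "customer": {"id": ..., "name": ...}, "items": [{"sku": ..., "qty": ...}]}
    raw = spark.read.json("/data/orders.json")

    flat = (
        raw
        .select(
            "order_id",
            F.col("customer.id").alias("customer_id"),      # promote struct fields to columns
            F.col("customer.name").alias("customer_name"),
            F.explode("items").alias("item"),               # one output row per array element
        )
        .select(
            "order_id", "customer_id", "customer_name",
            F.col("item.sku").alias("sku"),
            F.col("item.qty").alias("qty"),
        )
    )
    flat.show()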
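
For Q4, the standard answer is the Structured Streaming Kafka source. A hedged sketch: the broker address, topic name and payload schema are assumptions, and the spark-sql-kafka connector must be available on the cluster:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka-read-sketch").getOrCreate()

    # Schema of the JSON payload carried in the Kafka value (illustrative).
    value_schema = StructType([
        StructField("event_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    stream = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")   # illustrative broker
        .option("subscribe", "orders")                      # illustrative topic
        .option("startingOffsets", "earliest")
        .load()
    )

    # Kafka delivers key/value as binary; cast the value and parse the JSON payload.
    events = stream.select(
        F.from_json(F.col("value").cast("string"), value_schema).alias("payload")
    ).select("payload.*")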

Interview Preparation Tips

Interview preparation tips for other job seekers - Focus on core technical

Interview experience: 1 (Bad)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Selected

I applied via Company Website and was interviewed in Apr 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. Challenges faced in production deployment
  • Ans. 

    Challenges in production deployment include scalability, data consistency, and monitoring.

    • Ensuring scalability to handle increasing data volumes and user loads

    • Maintaining data consistency across different databases and systems

    • Implementing effective monitoring and alerting to quickly identify and resolve issues

  • Answered by AI
  • Q2. Tell me about your project
  • Ans. 

    Developed a data pipeline to ingest, process, and analyze customer feedback data for a retail company.

    • Used Azure Data Factory to orchestrate data flow

    • Implemented Azure Databricks for data processing and analysis

    • Utilized Azure Synapse Analytics for data warehousing

    • Generated visualizations using Power BI for insights

    • Implemented machine learning models for sentiment analysis

  • Answered by AI

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: 4-6 weeks
Result: No response

I applied via Referral and was interviewed in May 2024. There was 1 interview round.

Round 1 - Technical 

(2 Questions)

  • Q1. What is polybase?
  • Ans. 

    PolyBase is a feature in Azure Synapse Analytics (formerly Azure SQL Data Warehouse) that allows users to query data stored in Hadoop or Azure Blob Storage.

    • Polybase enables users to access and query external data sources without moving the data into the database.

    • It provides a virtualization layer that allows SQL queries to seamlessly integrate with data stored in Hadoop or Azure Blob Storage.

    • Polybase can significantly improve query performance by levera...

  • Answered by AI
  • Q2. Explain your current project architecture.
Interview experience: 4 (Good)
Difficulty level: Easy
Process Duration: Less than 2 weeks

I applied via Naukri.com and was interviewed in May 2024. There were 2 interview rounds.

Round 1 - One-on-one 

(5 Questions)

  • Q1. Project architecture; which Spark transformations were used?
  • Ans. 

    The project architecture includes Spark transformations for processing large volumes of data.

    • Spark transformations are used to manipulate data in distributed computing environments.

    • Examples of Spark transformations include map, filter, reduceByKey, join, etc.

  • Answered by AI
  • Q2. Advanced SQL questions - highest sales from each city
  • Ans. 

    Use window functions like ROW_NUMBER() to find the highest sales from each city in SQL (a sketch follows this round's questions).

    • Use PARTITION BY clause in ROW_NUMBER() to partition data by city

    • Order the data by sales in descending order

    • Filter the results to only include rows with row number 1

  • Answered by AI
  • Q3. Data modelling - Star schema, Snowflake schema, Dimension and Fact tables
  • Q4. Databricks - how to mount?
  • Ans. 

    In Databricks, cloud storage is mounted to DBFS from a notebook using dbutils.fs.mount rather than a CLI command.

    • Call dbutils.fs.mount with the storage source URI (abfss:// or wasbs://), a mount point under /mnt, and the required auth configs.

    • Credentials (service principal or access key) are typically read from a secret scope; a sketch follows this round's questions.

  • Answered by AI
  • Q5. Questions on ADF - pipeline used in the project
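
For Q2 of this round, a hedged Spark SQL version of the ROW_NUMBER() approach; the sales table and its columns (city, store_id, sale_amount) are assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("top-sale-per-city").getOrCreate()

    top_per_city = spark.sql("""
        SELECT city, store_id, sale_amount
        FROM (
            SELECT city, store_id, sale_amount,
                   ROW_NUMBER() OVER (PARTITION BY city ORDER BY sale_amount DESC) AS rn
            FROM sales
        ) ranked
        WHERE rn = 1            -- keep only the highest sale per city
    """)
    top_per_city.show()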
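
For Q4, a hedged sketch of mounting ADLS Gen2 storage with dbutils.fs.mount from a Databricks notebook, authenticating with a service principal; the storage account, container, secret scope, key names and tenant ID are placeholders:

    # dbutils is predefined in Databricks notebooks; secrets come from a secret scope.
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": dbutils.secrets.get("my-scope", "sp-client-id"),
        "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("my-scope", "sp-client-secret"),
        "fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    }

    dbutils.fs.mount(
        source="abfss://mycontainer@mystorageacct.dfs.core.windows.net/",
        mount_point="/mnt/mycontainer",
        extra_configs=configs,
    )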
Round 2 - One-on-one 

(1 Question)

  • Q1. Questions on Databricks - optimizations, history, autoloader, liquid clustering, autoscaling

Everestdx Interview FAQs

How many rounds are there in Everestdx Azure Data Engineer interview?
Everestdx's interview process usually has 1 round. The most common round in the Everestdx interview process is Technical.
What are the top questions asked in Everestdx Azure Data Engineer interview?

Some of the top questions asked at the Everestdx Azure Data Engineer interview -

  1. How do you write stored procedures in Databricks?
  2. How do you optimize PySpark jobs?

Based on 1 Everestdx interview, 100% of candidates got their interview through a job portal (low confidence: based on a small number of responses).
Senior Software Engineer (10 salaries): ₹7.2 L/yr - ₹17.1 L/yr
Associate Trainee (7 salaries): ₹4.1 L/yr - ₹5 L/yr
Senior Developer (7 salaries): ₹8.4 L/yr - ₹13 L/yr
DevOps Engineer (6 salaries): ₹4 L/yr - ₹8.9 L/yr
Cloud Engineer (6 salaries): ₹6.8 L/yr - ₹9 L/yr
Compare Everestdx with

  • Thyrocare Technologies (3.5)
  • Metropolis Healthcare (4.1)
  • DRJ & CO (5.0)
  • SRL Diagnostics (4.1)
