
Capgemini Azure Data Engineer Interview Questions and Answers

Updated 21 Jan 2025

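Q1. How to read a Parquet file, how to call a notebook from ADF, the Azure DevOps CI/CD process, and system variables in ADF

Ans.

The question covers four separate topics. A short answer to each:

  • To read a Parquet file, use spark.read.parquet in PySpark, or the pandas (read_parquet) or PyArrow libraries outside Spark

  • To call a notebook from ADF, add a Notebook activity (for example the Azure Databricks Notebook activity) to the ADF pipeline

  • For the Azure DevOps CI/CD process, connect the factory to a Git repository and use Azure Pipelines to deploy the generated ARM templates to the other environments

  • System variables in ADF are read with expressions such as @pipeline().RunId, @pipeline().Pipeline and @pipeline().TriggerTime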
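For the Parquet part of the question, the answer names pandas and PyArrow; inside a Spark notebook the Spark reader is the usual alternative. The sketch below is only an illustration: the function name read_parquet and the column-pruning argument are hypothetical, and it assumes a SparkSession is passed in by the caller.

```python
def read_parquet(spark, path, columns=None):
    """Read a Parquet file into a Spark DataFrame.

    `spark` is an existing SparkSession supplied by the caller (assumption).
    `columns` is an optional list of column names to keep.
    """
    # Parquet files carry their own schema, so no header/schema options are needed.
    df = spark.read.parquet(path)
    if columns:
        # Selecting early lets Spark read only the required columns.
        df = df.select(*columns)
    return df


# Usage, assuming pyspark is installed and the (hypothetical) path exists:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   df = read_parquet(spark, "/data/sales.parquet", columns=["id", "amount"])
#   df.show()
```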


Q2. Difference between persist and cache in PySpark?

Ans.

cache() and persist() both keep a dataset available for reuse in PySpark. cache() is shorthand for persist() with the default storage level, while persist() lets you choose the storage level.

  • persist() accepts a StorageLevel (MEMORY_ONLY, MEMORY_AND_DISK, DISK_ONLY and so on), while cache() takes no arguments

  • The default level depends on the API: memory only for RDDs, memory and disk for DataFrames, so a cached DataFrame can still spill to disk

  • Both are lazy: the data is stored only when an action runs, and unpersist() releases it
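A minimal sketch of the two calls side by side. The helper name cache_and_persist is hypothetical, and the DataFrames and the storage level are passed in by the caller so the snippet does not assume a particular Spark setup.

```python
def cache_and_persist(df_default, df_explicit, level):
    """Cache one DataFrame with the default level and persist another explicitly.

    `level` is expected to be a pyspark.StorageLevel such as
    StorageLevel.DISK_ONLY (assumption: supplied by the caller).
    """
    # cache() takes no arguments; it uses the default storage level.
    cached = df_default.cache()
    # persist() lets the caller choose memory, disk, or both.
    persisted = df_explicit.persist(level)
    return cached, persisted


# Usage, assuming pyspark is installed:
#   from pyspark import StorageLevel
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   cached, persisted = cache_and_persist(
#       spark.range(100), spark.range(200), StorageLevel.DISK_ONLY
#   )
#   print(cached.storageLevel, persisted.storageLevel)
#   cached.unpersist()
#   persisted.unpersist()
```

The usage lines deliberately use two different DataFrames: Spark tracks cached data by query plan, so persisting an already cached plan with a new level keeps the first level.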


Q3. How did you migrate Oracle data into Azure?

Ans.

I migrated Oracle data into Azure using Azure Data Factory and Azure Database Migration Service.

  • Used Azure Data Factory to create pipelines for data migration

  • Utilized Azure Database Migration Service for schema and data migration

  • Ensured data consistency and integrity during the migration process


Q4. SDC 1 and SDC 2 in ADF explain with example

Ans.

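"SDC" in the question is most likely a misspelling of SCD (Slowly Changing Dimension). SCD Type 1 and SCD Type 2 are dimension-loading patterns, typically built in ADF with mapping data flows; they are not integration runtimes.

  • SCD Type 1 overwrites the changed attribute in the dimension row, so no history is kept.

  • SCD Type 2 keeps history: the existing row is marked as expired (for example with an end date or an is-current flag) and a new row is inserted with the new value.

  • Example: when a customer's city changes from Pune to Mumbai, Type 1 updates the city to Mumbai in place, while Type 2 closes the Pune row and adds a new current row for Mumbai.

  • In ADF, a mapping data flow can implement either pattern, for example with Lookup or Exists, Derived Column and Alter Row transformations writing to the dimension table.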
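Reading the question as SCD (Slowly Changing Dimension) Type 1 and Type 2, the difference can be sketched in plain Python. This is an illustration of the two patterns only, not ADF code: the CustomerRow fields, the function names and the sample data are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    customer_id: int
    city: str
    start_date: date
    end_date: Optional[date] = None  # None means the row is still valid
    is_current: bool = True


def apply_scd1(rows: List[CustomerRow], customer_id: int, new_city: str) -> List[CustomerRow]:
    """Type 1: overwrite the attribute in place; the old value is lost."""
    return [replace(r, city=new_city) if r.customer_id == customer_id else r for r in rows]


def apply_scd2(rows: List[CustomerRow], customer_id: int, new_city: str,
               change_date: date) -> List[CustomerRow]:
    """Type 2: expire the current row and append a new current row."""
    result = []
    for r in rows:
        if r.customer_id == customer_id and r.is_current and r.city != new_city:
            # Close the old version instead of overwriting it.
            result.append(replace(r, end_date=change_date, is_current=False))
            # Add the new version, valid from the change date onwards.
            result.append(CustomerRow(customer_id, new_city, change_date))
        else:
            result.append(r)
    return result


dim = [CustomerRow(1, "Pune", date(2023, 1, 1))]
scd1 = apply_scd1(dim, 1, "Mumbai")                    # one row, city overwritten
scd2 = apply_scd2(dim, 1, "Mumbai", date(2024, 6, 1))  # two rows, history kept
print(scd1)
print(scd2)
```

After the Type 1 call the dimension still has one row and the Pune value is gone; after the Type 2 call it has two rows, the expired Pune row and the current Mumbai row.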


Q5. Read a CSV file in PySpark

Ans.

Use the DataFrame reader of a SparkSession (spark.read) to load the CSV file into a DataFrame.

  • Create or reuse a SparkSession and read the file through spark.read

  • Specify the format (csv) and the file path when reading

  • Set options such as 'header' (first line holds column names) and 'inferSchema' (detect column types) so the file is parsed correctly
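The steps above can be sketched as a small helper. The function name read_csv and the example path are hypothetical, and the SparkSession is assumed to be created by the caller.

```python
def read_csv(spark, path):
    """Load a CSV file into a DataFrame using an existing SparkSession."""
    return (
        spark.read.format("csv")
        # header=True: use the first line as column names.
        # inferSchema=True: let Spark detect column types (costs an extra pass over the file).
        .options(header=True, inferSchema=True)
        .load(path)
    )


# Usage, assuming pyspark is installed and the (hypothetical) path exists:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("read-csv").getOrCreate()
#   df = read_csv(spark, "/data/people.csv")
#   df.printSchema()
```

For large files, passing an explicit schema instead of inferSchema avoids the extra pass over the data.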


Q6. Remove duplicates

Ans.

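Use the DISTINCT keyword in SQL to return unique rows from a query; to delete duplicates from the table itself, keep one row per group and delete the rest.

  • Use SELECT DISTINCT column_name FROM table_name to retrieve the unique values of a specific column.

  • Use SELECT DISTINCT * FROM table_name to retrieve unique rows across all columns.

  • Use GROUP BY with HAVING COUNT(*) > 1 to find which keys are duplicated, then use ROW_NUMBER() OVER (PARTITION BY the key columns) to number the copies and delete every row numbered above 1.

  • In PySpark, df.dropDuplicates(), optionally with a list of columns, removes duplicate rows from a DataFrame.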
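A small runnable sketch of the SQL approaches, using Python's built-in sqlite3 module with an in-memory database. The employees table and its rows are made up for illustration, and the delete step keeps the lowest id per group with MIN(id) rather than a window function so that it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Asha", "Data"), ("Asha", "Data"), ("Ravi", "Cloud"),
     ("Ravi", "Cloud"), ("Meena", "Data")],
)

# DISTINCT changes only the query result; the table still holds 5 rows.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, dept FROM employees ORDER BY name"
).fetchall()

# GROUP BY ... HAVING shows which (name, dept) pairs occur more than once.
duplicated = conn.execute(
    "SELECT name, dept, COUNT(*) FROM employees "
    "GROUP BY name, dept HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

# Delete every row except the one with the lowest id in each group.
conn.execute(
    "DELETE FROM employees WHERE id NOT IN "
    "(SELECT MIN(id) FROM employees GROUP BY name, dept)"
)
remaining = conn.execute("SELECT name, dept FROM employees ORDER BY name").fetchall()

print(distinct_rows)
print(duplicated)
print(remaining)
```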


Interview Process at Capgemini Azure Data Engineer

based on 10 interviews
3 Interview rounds
Technical Round - 1
Technical Round - 2
HR Round
Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.
