Accenture Data Engineer Interview Questions, Process, and Tips

Updated 5 Jan 2025

Top Accenture Data Engineer Interview Questions and Answers

  • Q1. What optimisations are possible to reduce the overhead of reading large datasets in Spark?
  • Q2. Write a SQL query to find the name of the person who logged in last within each country from the Person table.
  • Q3. How to import data from an RDBMS via Sqoop without a primary key?

Accenture Data Engineer Interview Experiences

78 interviews found

I applied via Naukri.com and was interviewed in Apr 2022. There was 1 interview round.

Round 1 - Technical (1 Question)

  • Q1. How to import data from an RDBMS via Sqoop without a primary key?
  • Ans. 

    Use the --split-by option in Sqoop to import data from an RDBMS table that has no primary key.

    • Use --split-by to specify a column on which to split the import across multiple mappers

    • Use --boundary-query to specify a query that determines the range of values for the --split-by column

    • Example: sqoop import --connect jdbc:mysql://localhost/mydb --username root --password password --table mytable --split-by id

    • Example: sqoop import --connect jdbc:m...

  • Answered by AI
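
A fuller command combining both options might look like the following; the connection details, table and column names are illustrative only:

    sqoop import \
      --connect jdbc:mysql://localhost/mydb \
      --username root --password password \
      --table mytable \
      --split-by id \
      --boundary-query "SELECT MIN(id), MAX(id) FROM mytable" \
      -m 4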

Interview Preparation Tips

Topics to prepare for Accenture Data Engineer interview:
  • sqoop
  • Spark architecture
  • DAG
  • Hive Joins
  • RANK Function
Interview preparation tips for other job seekers - Along with practical knowledge, mug up some theory as well.
Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Naukri.com and was interviewed in Dec 2024. There was 1 interview round.

Round 1 - Technical (5 Questions)

  • Q1. Scenario-based questions on Azure Data Factory and pipelines
  • Q2. Optimisation techniques to improve the performance of Databricks
  • Q3. What is Autoloader?
  • Q4. What is Unity Catalog?
  • Q5. How do you set up an alerting mechanism in ADF for failed pipelines?

Data Engineer Interview Questions Asked at Other Companies

  • Q1. (asked in Cisco) Optimal Strategy for a Coin Game: You are playing a coin game with ...
  • Q2. (asked in Sigmoid) Next Greater Element Problem Statement: You are given an array arr ...
  • Q3. (asked in Sigmoid) Problem: Search In Rotated Sorted Array: Given a sorted array that ...
  • Q4. (asked in Cisco) Covid Vaccination Distribution Problem: As the Government ramps up ...
  • Q5. (asked in LTIMindtree) If you are given a card with 1-1000 numbers and there are 4 bo ...

Data Engineer Interview Questions & Answers

Anonymous, posted on 17 Jul 2024

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Selected

I applied via Recruitment Consultant and was interviewed in Jun 2024. There was 1 interview round.

Round 1 - One-on-one (20 Questions)

  • Q1. Tell me about yourself
  • Q2. Project Architecture
  • Q3. Rate yourself out of 5 in PySpark, Python and SQL
  • Ans. 

    I would rate myself 4 in Pyspark, 5 in Python, and 4 in SQL.

    • Strong proficiency in Python programming language

    • Experience in working with Pyspark for big data processing

    • Proficient in writing complex SQL queries for data manipulation

    • Familiarity with optimizing queries for performance

    • Hands-on experience in data engineering projects

  • Answered by AI
  • Q4. How do you handle duplicates in Python?
  • Ans. 

    Use Python's built-in data structures like sets or dictionaries to handle duplicates.

    • Use a set to remove duplicates from a list: unique_list = list(set(original_list))

    • Use a dictionary to remove duplicates from a list while preserving order: unique_list = list(dict.fromkeys(original_list))

  • Answered by AI
  • Q5. Methods of migrating a Hive metastore to Unity Catalog in Databricks?
  • Ans. 

    Databricks provides migration paths such as the SYNC SQL command, the databricks-cli and the Databricks Labs UCX tool for moving Hive metastore objects into Unity Catalog.

    • Assess the existing Hive metastore and decide which schemas and tables need to move.

    • Upgrade external tables with the SYNC command, which registers them in a Unity Catalog schema without copying data.

    • Script bulk migrations with databricks-cli or the UCX project; managed tables can be recreated in Unity Catalog via CTAS or deep clone.

    • Validate the migration by checking table metadata and running sample queries against the new catalog.

  • Answered by AI
  • Q6. How do you read a CSV file from an ADLS path?
  • Ans. 

    To read a CSV file from an ADLS path, you can use libraries like pandas or pyspark.

    • Use pandas library in Python to read a CSV file from ADLS path

    • Use pyspark library in Python to read a CSV file from ADLS path

    • Ensure you have the necessary permissions to access the ADLS path

  • Answered by AI
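
A minimal PySpark sketch of the direct-access approach; the storage account, container and file path below are placeholders, and authentication is assumed to be already configured in the workspace:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Read a CSV file directly from ADLS Gen2 via the abfss:// URI
    df = spark.read.csv(
        "abfss://mycontainer@mystorageacct.dfs.core.windows.net/data/input.csv",
        header=True,       # first row holds column names
        inferSchema=True,  # let Spark infer the column types
    )
    df.show(5)
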
  • Q7. There was a table provided on coding screen and asked to write different programs and SQL queries from the table and tell the approach you are taking ? Like age greater than 30 then sum the age how would y...
  • Q8. How many stages will be created from the above code that I have written?
  • Ans. 

    In Spark, the number of stages is determined by the shuffle boundaries in the job: each wide transformation starts a new stage.

    • Spark pipelines consecutive narrow transformations (map, filter) into a single stage.

    • Every wide transformation (groupBy, join, repartition) introduces a shuffle and therefore a new stage.

    • To count the stages, identify the shuffle boundaries in the code; the DAG view in the Spark UI shows the exact breakdown.

  • Answered by AI
  • Q9. Narrow vs wide transformations?
  • Ans. 

    In a narrow transformation each output partition depends on a single input partition; a wide transformation needs data from multiple partitions and triggers a shuffle.

    • Narrow transformations are pipelined within a stage, making them cheap and easy to parallelise.

    • Wide transformations shuffle data across the cluster, which can cause performance issues.

    • Examples of narrow transformations include map and filter; examples of wide transformations include groupByKey, join and repartition.

  • Answered by AI
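
A small sketch contrasting the two kinds of transformation on a toy DataFrame (the column names are made up):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])

    narrow = df.filter(F.col("value") > 1)        # narrow: partition-local, no shuffle
    wide = df.groupBy("key").agg(F.sum("value"))  # wide: shuffles rows by key
    wide.show()
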
  • Q10. What are actions and transformations?
  • Ans. 

    Transformations build up a lazy computation plan, while actions trigger its execution.

    • Transformations are functions that take an input dataset and produce an output dataset, often filtering, aggregating, or joining data; they are evaluated lazily.

    • Actions are operations that trigger the execution of the accumulated transformations and return a result or write output.

    • Examples of actions include 'count', 'collect' and 'saveAsTextFile'; examples of transformations include 'map', 'filter' and 'join'.

  • Answered by AI
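
A quick sketch of the lazy-evaluation behaviour this describes:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    rdd = spark.sparkContext.parallelize(range(10))

    doubled = rdd.map(lambda x: x * 2)          # transformation: only builds the lineage
    total = doubled.reduce(lambda a, b: a + b)  # action: triggers the actual job
    print(total)                                # 90
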
  • Q11. What happens when we enforce the schema versus when we manually define the schema in the code?
  • Ans. 

    Enforcing the schema validates incoming data against a predefined structure, while manually defining the schema in code gives explicit control over column names and types.

    • Enforcing the schema ensures that all data conforms to a predefined structure and format, preventing silent errors and inconsistencies.

    • Manually defining the schema avoids the cost and guesswork of schema inference and documents the expected structure.

    • A manual schema is typically built with StructType/StructField and passed to the reader via .schema(...).

  • Answered by AI
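
A minimal sketch of defining a schema manually; the column names and file path are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    spark = SparkSession.builder.getOrCreate()

    schema = StructType([
        StructField("name", StringType(), True),
        StructField("age", IntegerType(), True),
    ])

    # The reader validates rows against this schema instead of inferring one
    df = spark.read.schema(schema).csv("/path/to/people.csv", header=True)
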
  • Q12. What optimisations are possible to reduce the overhead of reading large datasets in Spark?
  • Ans. 

    Optimizations like partitioning, caching, and using efficient file formats can reduce overhead in reading large datasets in Spark.

    • Partitioning data based on key can reduce the amount of data shuffled during joins and aggregations

    • Caching frequently accessed datasets in memory can avoid recomputation

    • Using efficient file formats like Parquet or ORC can reduce disk I/O and improve read performance

  • Answered by AI
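
A sketch showing the listed optimisations together; the paths and column names are placeholders:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Columnar format + partition pruning: only the matching partition is read
    df = spark.read.parquet("/data/events")            # assumed partitioned by event_date
    recent = df.filter(df.event_date == "2024-01-01")

    # Cache a dataset that is reused several times to avoid recomputation
    recent.cache()
    recent.count()   # first action materialises the cache
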
  • Q13. Write a SQL query to find the name of the person who logged in last within each country from the Person table.
  • Ans. 

    SQL query to find the name of person who logged in last within each country from Person Table

    • Use a subquery to find the max login time for each country

    • Join the Person table with the subquery on country and login time to get the name of the person

  • Answered by AI
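
One possible form of the query, assuming the Person table is registered as a view and has name, country and login_time columns (the post does not give the schema):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    spark.sql("""
        SELECT p.name, p.country
        FROM Person p
        JOIN (
            SELECT country, MAX(login_time) AS last_login
            FROM Person
            GROUP BY country
        ) m
          ON p.country = m.country AND p.login_time = m.last_login
    """).show()
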
  • Q14. Difference between List and Tuple?
  • Ans. 

    List is mutable, Tuple is immutable in Python.

    • List can be modified after creation, Tuple cannot be modified.

    • List is defined using square brackets [], Tuple is defined using parentheses ().

    • Example: list_example = [1, 2, 3], tuple_example = (4, 5, 6)

  • Answered by AI
  • Q15. Difference between Rank, Dense Rank and Row Number, and when do we use each?
  • Ans. 

    RANK and DENSE_RANK give tied rows the same rank, differing in whether gaps follow; ROW_NUMBER numbers every row uniquely.

    • RANK assigns the same rank to rows with equal values and leaves gaps in the sequence after ties.

    • DENSE_RANK also assigns the same rank to ties but leaves no gaps in the sequence.

    • ROW_NUMBER assigns a unique sequential number to each row regardless of ties, so it is used when every row must be distinguished.

  • Answered by AI
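
A side-by-side sketch of the three window functions on toy data:

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(100,), (200,), (200,), (300,)], ["salary"])

    w = Window.orderBy("salary")
    df.select(
        "salary",
        F.rank().over(w).alias("rank"),              # 1, 2, 2, 4  (gap after the tie)
        F.dense_rank().over(w).alias("dense_rank"),  # 1, 2, 2, 3  (no gap)
        F.row_number().over(w).alias("row_number"),  # 1, 2, 3, 4  (always unique)
    ).show()
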
  • Q16. What is List Comprehension?
  • Ans. 

    List comprehension is a concise way to create lists in Python by applying an expression to each item in an iterable.

    • Syntax: [expression for item in iterable]

    • Can include conditions: [expression for item in iterable if condition]

    • Example: squares = [x**2 for x in range(10)]

  • Answered by AI
  • Q17. Tell me about the performance optimization done in your project?
  • Q18. Difference between the interactive cluster and job cluster?
  • Ans. 

    Interactive (all-purpose) clusters support real-time exploration in notebooks, while job clusters run scheduled or automated batch jobs.

    • Interactive clusters are created manually, can be shared by several users, and stay up across notebook sessions.

    • Job clusters are created automatically for a job run and terminate when the job finishes, which keeps costs tied to the workload.

    • Interactive clusters suit development and ad hoc analysis; job clusters suit production pipelines and heavy batch workloads.

  • Answered by AI
  • Q19. How do you add a column to a DataFrame? How do you rename a column in a DataFrame?
  • Ans. 

    To add a column to a DataFrame, use the 'withColumn' method. To rename a column, use the 'withColumnRenamed' method.

    • 'withColumn' takes the new column name and an expression that computes the values for that column.

    • Example: df.withColumn('new_column', df['existing_column'] * 2)

    • 'withColumnRenamed' takes the current column name and the new column name.

    • Example: df.withColumnRenamed('existing_column', 'renamed_column')

  • Answered by AI
  • Q20. Difference between coalesce and repartition, and in which cases do we use each?
  • Ans. 

    Coalesce merges existing partitions into fewer ones without a full shuffle, while repartition redistributes data and can increase or decrease the partition count.

    • Coalesce reduces the number of partitions in a DataFrame by combining small partitions into larger ones, avoiding a full shuffle.

    • Repartition shuffles the data across the cluster, so it can both increase and decrease the number of partitions and rebalance skew.

    • Coalesce is more efficient than repartition when only reducing partitions; use repartition when you need more partitions or an even distribution.

  • Answered by AI
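
A short sketch of the difference; the partition counts are arbitrary:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1_000_000)

    wider = df.repartition(200)    # full shuffle; can increase the partition count
    fewer = wider.coalesce(10)     # merges partitions; no full shuffle, only decreases
    print(fewer.rdd.getNumPartitions())  # 10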

Interview Preparation Tips

Topics to prepare for Accenture Data Engineer interview:
  • Spark
  • Databricks
  • SQL
  • Python
  • ETL
Interview preparation tips for other job seekers - Focus on basics and definitions, and understand the Spark internals. Write SQL queries efficiently.

Skills evaluated in this interview

Data Engineer Interview Questions & Answers

Anonymous, posted on 15 Oct 2024

Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: 2-4 weeks
Result: Selected

I applied via Company Website and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - One-on-one (5 Questions)

  • Q1. UNION vs UNION ALL
  • Ans. 

    Union combines and removes duplicates, while union all combines all rows including duplicates.

    • Union removes duplicates from the result set

    • Union all includes all rows, even duplicates

    • Use union when you want to remove duplicates, use union all when duplicates are needed

  • Answered by AI
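
A small Spark SQL sketch, assuming two hypothetical views branch_a and branch_b are already registered:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # UNION deduplicates the combined rows
    spark.sql("SELECT city FROM branch_a UNION SELECT city FROM branch_b").show()

    # UNION ALL keeps every row, duplicates included
    spark.sql("SELECT city FROM branch_a UNION ALL SELECT city FROM branch_b").show()
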
  • Q2. Rank vs dense rank
  • Ans. 

    Rank leaves gaps in the sequence after ties, while dense rank assigns consecutive ranks with no gaps.

    • Both assign the same rank to rows with equal values; they differ in what follows a tie.

    • RANK skips ranks after ties (1, 2, 2, 4), so gaps can appear in the sequence.

    • DENSE_RANK continues with the next consecutive rank (1, 2, 2, 3), so the sequence has no gaps.

  • Answered by AI
  • Q3. Facts vs dimensions table
  • Ans. 

    Facts tables contain numerical data while dimensions tables contain descriptive attributes.

    • Facts tables store quantitative data like sales revenue or quantity sold

    • Dimensions tables store descriptive attributes like product name or customer details

    • Facts tables are typically used for analysis and reporting, while dimensions tables provide context for the facts

  • Answered by AI
  • Q4. Basics of Databricks
  • Q5. Lambda in Python
  • Ans. 

    Lambda functions in Python are anonymous functions that can have any number of arguments but only one expression.

    • Lambda functions are defined using the lambda keyword.

    • They are commonly used for small, one-time tasks.

    • Lambda functions can be used as arguments to higher-order functions like map, filter, and reduce.

  • Answered by AI
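
A few one-liners illustrating the points above:

    nums = [1, 2, 3, 4, 5]

    squares = list(map(lambda x: x ** 2, nums))       # [1, 4, 9, 16, 25]
    evens = list(filter(lambda x: x % 2 == 0, nums))  # [2, 4]
    pairs = sorted([(2, "b"), (1, "a")], key=lambda t: t[0])  # sort by first element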

Interview Preparation Tips

Topics to prepare for Accenture Data Engineer interview:
  • Azure Databricks
  • Advanced SQL
  • Python
  • Pyspark

Skills evaluated in this interview

Accenture interview questions for designations

  • Senior Data Engineer (13)
  • Big Data Engineer (3)
  • Lead Data Engineer (1)
  • Data Architect (2)
  • Azure Data Engineer (8)
  • Data Engineer 1 (3)
  • GCP Data Engineer (3)
  • Associate Data Engineer (3)
Data Engineer Interview Questions & Answers

Anonymous, posted on 13 Dec 2024

Interview experience: 1 (Bad)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical (1 Question)

  • Q1. GCP Data Engineer concepts. Please don't waste your time giving interviews at Accenture.

Interview Preparation Tips

Interview preparation tips for other job seekers - I recently interviewed with Accenture, and although I do well in interviews, I feel this company wastes candidates' time. They seem to conduct interviews merely for appearances. I had two offers in hand that I did not disclose, and I am confident in my technical abilities. Even so, the company spent my time on interviews only to reject candidates. I want to highlight this for fellow job seekers: please do not waste your time.


Data Engineer Interview Questions & Answers

Anonymous, posted on 22 Nov 2024

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Selected

I applied via Approached by Company and was interviewed in Oct 2024. There were 2 interview rounds.

Round 1 - Coding Test

It was a 60-minute test with 11 MCQs, 3 SQL questions and 1 Python question.

Round 2 - Technical (2 Questions)

  • Q1. End-to-end Databricks code to read multiple files from ADLS and write them into a single file
  • Ans. 

    Use PySpark in Databricks to read the files from ADLS and write them back out as a single file.

    • Access ADLS through a mount point or directly via the abfss:// URI

    • Read the multiple files with Spark's read method; a wildcard path matches many files at once

    • If the files are read separately, combine the DataFrames with union

    • Coalesce to a single partition before calling Spark's write method so only one output file is produced

  • Answered by AI
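
A minimal end-to-end sketch; the ADLS paths are placeholders and workspace access to the storage account is assumed:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    base = "abfss://mycontainer@mystorageacct.dfs.core.windows.net"

    # The wildcard reads every CSV under the input folder into one DataFrame
    df = spark.read.csv(f"{base}/input/*.csv", header=True, inferSchema=True)

    # coalesce(1) forces a single output part file
    df.coalesce(1).write.mode("overwrite").csv(f"{base}/output/combined", header=True)
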
  • Q2. PySpark architecture

Skills evaluated in this interview


Data Engineer Interview Questions & Answers

Dipak Rout, posted on 21 Nov 2024

Interview experience: 5 (Excellent)
Difficulty level: -
Process Duration: -
Result: -
Round 1 - Technical (2 Questions)

  • Q1. Describe your project
  • Q2. Describe the Spark architecture
  • Ans. 

    Spark architecture is a distributed computing framework that provides high-level APIs for various languages.

    • Spark architecture consists of a cluster manager, worker nodes, and a driver program.

    • It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.

    • Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object.

    • It supports various data sources such as HDFS, S3, JDBC and local file systems.

  • Answered by AI

Skills evaluated in this interview

Interview experience: 3 (Average)
Difficulty level: Easy
Process Duration: Less than 2 weeks
Result: No response

I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical (2 Questions)

  • Q1. Python - remove duplicates using a set
  • Ans. 

    Use set() function to remove duplicates from a list in Python.

    • Convert the list to a set using set() function

    • Convert the set back to a list to remove duplicates

    • Example: list_with_duplicates = ['a', 'b', 'a', 'c']; list_without_duplicates = list(set(list_with_duplicates))

  • Answered by AI
  • Q2. PySpark - add a column with a default value (sketch below)
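
No answer was posted for Q2; a minimal sketch using lit(), with a hypothetical column name and value:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1,), (2,)], ["id"])

    # lit() wraps a constant so every row receives the same default value
    df = df.withColumn("status", F.lit("active"))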

Skills evaluated in this interview

Interview experience: 4 (Good)
Difficulty level: -
Process Duration: -
Result: No response
Round 1 - Technical (1 Question)

  • Q1. SQL-based questions

Round 2 - Technical (1 Question)

  • Q1. Spark questions, e.g. repartitioning

Data Engineer Interview Questions & Answers

Anonymous, posted on 10 Oct 2024

Interview experience: 4 (Good)
Difficulty level: Moderate
Process Duration: Less than 2 weeks
Result: Not Selected

I applied via Job Portal and was interviewed in Sep 2024. There was 1 interview round.

Round 1 - Technical (2 Questions)

  • Q1. Scala, ADB, ADF, Synapse
  • Q2. Concepts should be analysed in a detailed way.
Interview experience: 5 (Excellent)
Difficulty level: Moderate
Process Duration: 6-8 weeks
Result: No response

I applied via Company Website and was interviewed in Sep 2024. There were 2 interview rounds.

Round 1 - Coding Test

There were 30 questions, mostly SQL.

Round 2 - One-on-one (2 Questions)

  • Q1. Explain Databricks
  • Ans. 

    Databricks is a unified analytics platform that combines data engineering, data science, and business analytics.

    • Databricks provides a collaborative workspace for data engineers, data scientists, and business analysts to work together on big data projects.

    • It integrates with popular tools like Apache Spark for data processing and machine learning.

    • Databricks offers automated cluster management and scaling to handle large workloads.

  • Answered by AI
  • Q2. Cluster types in Databricks
  • Ans. 

    Databricks traditionally distinguishes two cluster types: Standard and High Concurrency.

    • Standard clusters are intended for single-user workloads.

    • High Concurrency clusters are shared by multiple users and provide fine-grained resource sharing and isolation.

    • Both types can be configured with different node sizes, autoscaling and auto-termination options.

  • Answered by AI

Skills evaluated in this interview

Accenture Interview FAQs

How many rounds are there in Accenture Data Engineer interview?
Accenture interview process usually has 1-2 rounds. The most common rounds in the Accenture interview process are Technical, HR and One-on-one Round.
How to prepare for Accenture Data Engineer interview?
Go through your CV in detail and study all the technologies mentioned in your CV. Prepare at least two technologies or languages in depth if you are appearing for a technical interview at Accenture. The most common topics and skills that interviewers at Accenture expect are SQL, Data Warehousing, Data Quality, Python and Data Modeling.
What are the top questions asked in Accenture Data Engineer interview?

Some of the top questions asked at the Accenture Data Engineer interview:

  1. What optimisations are possible to reduce the overhead of reading large datasets in Spark?
  2. Write a SQL query to find the name of the person who logged in last within each country from the Person table.
  3. How to import data from an RDBMS via Sqoop without a primary key?
How long is the Accenture Data Engineer interview process?

The duration of the Accenture Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete.


Accenture Data Engineer Interview Process

Based on 85 interviews, the process typically has 3 interview rounds:

  • Technical Round - 1
  • Technical Round - 2
  • HR Round

Accenture Data Engineer Salary
Based on 2.4k salaries: ₹2.8 L/yr - ₹15.6 L/yr, which is 13% less than the average Data Engineer salary in India.

Accenture Data Engineer Reviews and Ratings

Based on 205 reviews: 4.0/5 overall.

Rating in categories:

  • Skill development: 3.9
  • Work-life balance: 3.9
  • Salary: 3.4
  • Job security: 3.8
  • Company culture: 3.9
  • Promotions: 3.1
  • Work satisfaction: 3.6
Popular roles and salaries at Accenture:

  • Application Development Analyst (38.9k salaries): ₹3 L/yr - ₹12 L/yr
  • Application Development - Senior Analyst (27k salaries): ₹6.9 L/yr - ₹17.5 L/yr
  • Team Lead (24.3k salaries): ₹7.1 L/yr - ₹25.6 L/yr
  • Senior Software Engineer (18.2k salaries): ₹6 L/yr - ₹19.5 L/yr
  • Software Engineer (17.4k salaries): ₹3.6 L/yr - ₹13.4 L/yr
Compare Accenture with: TCS (3.7), Cognizant (3.8), Capgemini (3.7), Infosys (3.6)