Accenture

Accenture Data Engineering Analyst Interview Questions and Answers for Experienced

Updated 29 Dec 2024

14 Interview questions

A Data Engineering Analyst was asked
Q. Suppose you are adding a block that takes a significant amount of time. How would you start debugging it?
Ans. 

To debug a slow block, start by identifying potential bottlenecks, analyzing logs, checking for errors, and profiling the code.

  • Identify potential bottlenecks in the code or system that could be causing the slow performance.

  • Analyze logs and error messages to pinpoint any issues or exceptions that may be occurring.

  • Use profiling tools to analyze the performance of the code and identify areas that need optimization.

  • Ch...
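
As a concrete starting point, a minimal sketch of timing and profiling a suspect block with Python's built-in tooling; slow_block() is a hypothetical stand-in for the newly added code:

```python
import cProfile
import pstats
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

def slow_block():
    time.sleep(0.2)  # placeholder for the suspect logic

start = time.perf_counter()
with cProfile.Profile() as prof:  # context-manager form needs Python 3.8+
    slow_block()
log.info("block took %.3fs", time.perf_counter() - start)

# Show the top five functions by cumulative time to localize the bottleneck
pstats.Stats(prof).sort_stats("cumulative").print_stats(5)
```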

A Data Engineering Analyst was asked
Q. Explain Airflow and its internal architecture.
Ans. 

Airflow is a platform to programmatically author, schedule, and monitor workflows.

  • Airflow is written in Python and uses Directed Acyclic Graphs (DAGs) to define workflows.

  • It has a web-based UI for visualization and monitoring of workflows.

  • Airflow consists of a scheduler, a metadata database, a web server, and an executor.

  • Tasks in Airflow are defined as operators, which determine what actually gets executed.

  • Example...
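
A minimal sketch of such a DAG, assuming Airflow 2.x (the dag_id, schedule, and the two callables are illustrative, not from the interview):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling rows from the source")

def load():
    print("writing rows to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older releases use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # the scheduler honors this dependency order
```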

A Data Engineering Analyst was asked
Q. What is pre-partitioning?
Ans. 

Pre-partitioning is the process of dividing data into smaller partitions before performing any operations on it.

  • Prepartitioning helps in improving query performance by reducing the amount of data that needs to be processed.

  • It can also help in distributing data evenly across multiple nodes in a distributed system.

  • Examples include partitioning a large dataset based on a specific column like date or region before running...
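
A minimal PySpark sketch of both flavors of pre-partitioning (the source path and column names are hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.parquet("events")  # hypothetical source

# In-memory: repartition by the key used downstream so related rows co-locate
by_region = df.repartition("region")
by_region.groupBy("region").count().show()

# On-disk: persist the layout so later jobs read only the partitions they need
df.write.partitionBy("event_date").parquet("events_partitioned")
```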

A Data Engineering Analyst was asked
Q. You have 200 Petabytes of data to load. How will you decide the number of executors required, considering the data is out of cache?
Ans. 

The number of executors required to load 200 Petabytes of data depends on the size of each executor and the available cache.

  • Calculate the size of each executor based on available resources and data size

  • Consider the amount of cache available for data processing

  • Determine the optimal number of executors based on the above factors
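
As a back-of-the-envelope illustration (the 128 MB partition size, 5 cores per executor, and the number of task waves are assumptions, not fixed rules):

```python
# Rough executor sizing for 200 PB of uncached data.
DATA_BYTES = 200 * 1024**5        # 200 PB
PARTITION_BYTES = 128 * 1024**2   # assume 128 MB input splits
CORES_PER_EXECUTOR = 5            # assume 5 concurrent tasks per executor
WAVES = 10_000                    # assume each core processes ~10k tasks

partitions = DATA_BYTES // PARTITION_BYTES           # ~1.68 billion tasks
executors = partitions // (CORES_PER_EXECUTOR * WAVES)
print(f"partitions={partitions:,} executors~{executors:,}")
# Real sizing also weighs executor memory, shuffle volume, and the SLA.
```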

A Data Engineering Analyst was asked
Q. What are case classes in Python?
Ans. 

Case classes are a Scala feature; Python itself has no case classes. The closest Python analogues are frozen dataclasses and NamedTuples, which create immutable objects for pattern matching and data modeling.

  • Case classes are typically used in functional programming to represent data structures.

  • They are immutable, meaning their values cannot be changed once they are created.

  • Case classes automatically derive equality, hash code, and string representation from the constructor arguments (Python dataclasses generate __eq__ and __repr__ similarly).

  • They are commonl...
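
A minimal sketch of the closest Python analogue, a frozen dataclass (the Employee fields are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class Employee:
    name: str
    salary: float

e = Employee("Asha", 90_000.0)
print(e)                                # auto-generated __repr__
print(e == Employee("Asha", 90_000.0))  # auto-generated __eq__ -> True
# e.salary = 1.0 would raise dataclasses.FrozenInstanceError
```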

A Data Engineering Analyst was asked
Q. What is an RDD in Spark?
Ans. 

RDD stands for Resilient Distributed Dataset in Spark, which is an immutable distributed collection of objects.

  • RDD is the fundamental data structure in Spark, representing a collection of elements that can be operated on in parallel.

  • RDDs are fault-tolerant, meaning they can automatically recover from failures.

  • RDDs support two types of operations: transformations (creating a new RDD from an existing one) and action...
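
A minimal PySpark sketch contrasting lazy transformations with an action:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

rdd = sc.parallelize([1, 2, 3, 4, 5])         # distributed collection
doubled = rdd.map(lambda x: x * 2)            # transformation: lazy
evens = doubled.filter(lambda x: x % 4 == 0)  # transformation: lazy
print(evens.collect())                        # action: runs the job -> [4, 8]
```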

A Data Engineering Analyst was asked
Q. Given an Employee table with columns Employee name, Salary, and Department, write a PySpark query to find the name of the employee with the second highest salary in each department.
Ans. 

Find the 2nd highest salary employee in each department using PySpark.

  • Read the CSV file into a DataFrame using spark.read.csv().

  • Group the DataFrame by 'Department' and use the 'dense_rank()' function to rank salaries.

  • Filter the DataFrame to get employees with a rank of 2.

  • Select the 'Employee name' and 'Department' columns for the final output.
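
A runnable sketch of that approach (the file name and header layout are assumptions):

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()
df = spark.read.csv("employee.csv", header=True, inferSchema=True)

w = Window.partitionBy("Department").orderBy(F.col("Salary").desc())
second_highest = (
    df.withColumn("rank", F.dense_rank().over(w))
      .filter(F.col("rank") == 2)           # dense_rank handles salary ties
      .select("Employee name", "Department")
)
second_highest.show()
```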

A Data Engineering Analyst was asked
Q. Define RDD Lineage and its process.
Ans. 

RDD Lineage is the record of transformations applied to an RDD and the dependencies between RDDs.

  • RDD Lineage tracks the sequence of transformations applied to an RDD from its source data.

  • It helps in fault tolerance by allowing RDDs to be reconstructed in case of data loss.

  • RDD Lineage is used in Spark to optimize the execution plan by eliminating unnecessary computations.

  • Example: If an RDD is created from a text fi...
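
A minimal sketch of inspecting lineage directly (PySpark returns the debug string as bytes):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

rdd = (sc.parallelize(range(100))
         .map(lambda x: (x % 10, x))
         .reduceByKey(lambda a, b: a + b))

# Prints the chain of dependencies Spark would replay after a failure
print(rdd.toDebugString().decode("utf-8"))
```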

A Data Engineering Analyst was asked
Q. What do you mean by broadcast variables?
Ans. 

Broadcast Variables are read-only shared variables that are cached on each machine in a Spark cluster rather than being sent with tasks.

  • Broadcast Variables are used to efficiently distribute large read-only datasets to all worker nodes in a Spark cluster.

  • They are useful for tasks that require the same data to be shared across multiple stages of a job.

  • Broadcast Variables are created using the broadcast() method in ...
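
A minimal sketch of creating and reading a broadcast variable (the lookup table is hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

country_names = {"IN": "India", "MY": "Malaysia", "US": "United States"}
bc = sc.broadcast(country_names)  # shipped once per executor, read-only

codes = sc.parallelize(["IN", "US", "IN"])
print(codes.map(lambda c: bc.value[c]).collect())
# ['India', 'United States', 'India']
```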

A Data Engineering Analyst was asked
Q. Given a list of strings, how would you determine the frequency of each unique string value? For example, given the input ['a', 'a', 'a', 'b', 'b', 'c'], the expected output is a:3, b:2, c:1.
Ans. 

Calculate the frequency of each unique string in an array and display the results.

  • Use a dictionary to count occurrences: {'a': 3, 'b': 2, 'c': 1}.

  • Iterate through the list and update counts for each character.

  • Example: For input ['a', 'a', 'b'], output should be 'a,2' and 'b,1'.

  • Utilize collections.Counter for a more concise solution.
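
A minimal sketch of both approaches, matching the example input:

```python
from collections import Counter

values = ['a', 'a', 'a', 'b', 'b', 'c']

# Manual dictionary counting
freq = {}
for v in values:
    freq[v] = freq.get(v, 0) + 1

# Concise equivalent
assert freq == dict(Counter(values))
print(", ".join(f"{k}:{n}" for k, n in freq.items()))  # a:3, b:2, c:1
```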

Accenture Data Engineering Analyst Interview Experiences for Experienced

5 interviews found

Interview experience: 4/5 (Good)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Selected

I applied via Referral and was interviewed in Aug 2023. There were 2 interview rounds.

Round 1 - Resume Shortlist 
Round 2 - Technical 

(15 Questions)

  • Q1. Introduce yourself, and explain your project and your role.
  • Q2. Explain Airflow and its internal architecture.
  • Ans. 

    Airflow is a platform to programmatically author, schedule, and monitor workflows.

    • Airflow is written in Python and uses Directed Acyclic Graphs (DAGs) to define workflows.

    • It has a web-based UI for visualization and monitoring of workflows.

    • Airflow consists of a scheduler, a metadata database, a web server, and an executor.

    • Tasks in Airflow are defined as operators, which determine what actually gets executed.

    • Example: A D...

  • Answered by AI
  • Q3. What is RDD in Spark?
  • Ans. 

    RDD stands for Resilient Distributed Dataset in Spark, which is an immutable distributed collection of objects.

    • RDD is the fundamental data structure in Spark, representing a collection of elements that can be operated on in parallel.

    • RDDs are fault-tolerant, meaning they can automatically recover from failures.

    • RDDs support two types of operations: transformations (creating a new RDD from an existing one) and actions (tr...

  • Answered by AI
  • Q4. Define RDD Lineage and its Process
  • Ans. 

    RDD Lineage is the record of transformations applied to an RDD and the dependencies between RDDs.

    • RDD Lineage tracks the sequence of transformations applied to an RDD from its source data.

    • It helps in fault tolerance by allowing RDDs to be reconstructed in case of data loss.

    • RDD Lineage is used in Spark to optimize the execution plan by eliminating unnecessary computations.

    • Example: If an RDD is created from a text file an...

  • Answered by AI
  • Q5. What do you mean by broadcast Variables?
  • Ans. 

    Broadcast Variables are read-only shared variables that are cached on each machine in a Spark cluster rather than being sent with tasks.

    • Broadcast Variables are used to efficiently distribute large read-only datasets to all worker nodes in a Spark cluster.

    • They are useful for tasks that require the same data to be shared across multiple stages of a job.

    • Broadcast Variables are created using the broadcast() method in Spark...

  • Answered by AI
  • Q6. What is broadcasting? Are you using broadcasting, and what are its limitations?
  • Ans. 

    Broadcasting is a technique used in Apache Spark to optimize data transfer by sending smaller data to all nodes in a cluster.

    • Broadcasting is used to efficiently distribute read-only data to all nodes in a cluster to avoid unnecessary data shuffling.

    • It is commonly used when joining large datasets with smaller lookup tables.

    • Broadcast variables are cached in memory and reused across multiple stages of a Spark job.

    • The limi...

  • Answered by AI
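
A minimal sketch of a broadcast (map-side) join in PySpark; the table names are hypothetical. The key limitation: the broadcast table must fit in each executor's memory (automatic broadcasts default to the 10 MB spark.sql.autoBroadcastJoinThreshold).

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
orders = spark.read.parquet("orders")    # large fact table
regions = spark.read.parquet("regions")  # small lookup table

joined = orders.join(F.broadcast(regions), on="region_id", how="left")
joined.explain()  # plan should show BroadcastHashJoin instead of a shuffle
```
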
  • Q7. Are you using accumulators? Explain the Catalyst optimizer.
  • Ans. 

    Accumulators are used for aggregating values across tasks, while Catalyst optimizer is a query optimizer for Apache Spark.

    • Accumulators are variables that are only added to through an associative and commutative operation and can be used to implement counters or sums.

    • Catalyst optimizer is a rule-based query optimizer that leverages advanced programming language features to build an extensible query optimizer.

    • Catalyst op...

  • Answered by AI
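
A minimal sketch of an accumulator used as a bad-record counter (the parse logic is illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

bad_rows = sc.accumulator(0)

def parse(line):
    try:
        return int(line)
    except ValueError:
        bad_rows.add(1)  # add-only; retried tasks can double-count in maps
        return 0

sc.parallelize(["1", "2", "oops", "4"]).map(parse).collect()
print(bad_rows.value)  # 1
```
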
  • Q8. Suppose you are adding a block that takes a significant amount of time. How would you start debugging it?
  • Ans. 

    To debug a slow block, start by identifying potential bottlenecks, analyzing logs, checking for errors, and profiling the code.

    • Identify potential bottlenecks in the code or system that could be causing the slow performance.

    • Analyze logs and error messages to pinpoint any issues or exceptions that may be occurring.

    • Use profiling tools to analyze the performance of the code and identify areas that need optimization.

    • Check f...

  • Answered by AI
  • Q9. You have 200 petabytes of data to load. How will you decide the number of executors required, considering the data is out of cache?
  • Ans. 

    The number of executors required to load 200 Petabytes of data depends on the size of each executor and the available cache.

    • Calculate the size of each executor based on available resources and data size

    • Consider the amount of cache available for data processing

    • Determine the optimal number of executors based on the above factors

  • Answered by AI
  • Q10. What is pre-partitioning?
  • Ans. 

    Pre-partitioning is the process of dividing data into smaller partitions before performing any operations on it.

    • Prepartitioning helps in improving query performance by reducing the amount of data that needs to be processed.

    • It can also help in distributing data evenly across multiple nodes in a distributed system.

    • Examples include partitioning a large dataset based on a specific column like date or region before running anal...

  • Answered by AI
  • Q11. Given an Employee table with columns Employee name, Salary, and Department, first read the CSV file, then write a PySpark query to find the name of the employee with the second highest salary in each department.
  • Ans. 

    Find the 2nd highest salary employee in each department using PySpark.

    • Read the CSV file into a DataFrame using spark.read.csv().

    • Group the DataFrame by 'Department' and use the 'dense_rank()' function to rank salaries.

    • Filter the DataFrame to get employees with a rank of 2.

    • Select the 'Employee name' and 'Department' columns for the final output.

  • Answered by AI
  • Q12. Given a list of string values, find the frequency of each value. For example, the input ['a', 'a', 'a', 'b', 'b', 'c'] should produce a:3, b:2, c:1.
  • Ans. 

    Calculate the frequency of each unique string in an array and display the results.

    • Use a dictionary to count occurrences: {'a': 3, 'b': 2, 'c': 1}.

    • Iterate through the list and update counts for each character.

    • Example: For input ['a', 'a', 'b'], output should be 'a,2' and 'b,1'.

    • Utilize collections.Counter for a more concise solution.

  • Answered by AI
  • Q13. What are case classes in Python?
  • Ans. 

    Case classes are a Scala feature; Python itself has no case classes. The closest Python analogues are frozen dataclasses and NamedTuples, which create immutable objects for pattern matching and data modeling.

    • Case classes are typically used in functional programming to represent data structures.

    • They are immutable, meaning their values cannot be changed once they are created.

    • Case classes automatically derive equality, hash code, and string representation from the constructor arguments (Python dataclasses generate __eq__ and __repr__ similarly).

    • They are commonly use...

  • Answered by AI
  • Q14. Suppose a file has 100 columns and you want to load only 10 of them. How would you approach this?
  • Ans. 

    To load specific columns from a file, use data processing tools to filter the required columns efficiently.

    • Use libraries like Pandas in Python: `df = pd.read_csv('file.csv', usecols=['col1', 'col2', ...])`.

    • In SQL, you can specify columns in your SELECT statement: `SELECT col1, col2 FROM table_name;`.

    • For CSV files, tools like awk can be used: `awk -F, '{print $1,$2,...}' file.csv`.

    • In ETL processes, configure the extract...

  • Answered by AI
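
A minimal PySpark sketch of the same idea (file name and column names are hypothetical; with columnar formats like Parquet, the unneeded 90 columns are never read at all):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

wanted = [f"col{i}" for i in range(1, 11)]  # the 10 columns to keep
df = spark.read.csv("wide_file.csv", header=True).select(*wanted)
df.printSchema()
```
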
  • Q15. What are Lambda Architecture and lambda functions?
  • Ans. 

    Lambda Architecture is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. Lambda function is a small anonymous function that can take any number of arguments, but can only have one expression.

    • Lambda Architecture combines batch processing and stream processing to handle large amounts of data efficiently.

    • Batch layer stores and proc...

  • Answered by AI
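
A minimal sketch of the lambda-function half of the question (the Lambda Architecture itself is a design pattern, not code):

```python
# A lambda takes any number of arguments but holds a single expression.
add = lambda x, y: x + y
print(add(2, 3))  # 5

# Typical use: a throwaway key function
pairs = [("b", 2), ("a", 3), ("c", 1)]
print(sorted(pairs, key=lambda p: p[1], reverse=True))
# [('a', 3), ('b', 2), ('c', 1)]
```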

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare more around PySpark and SQL.

Skills evaluated in this interview

Interview experience: 1/5 (Bad)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Coding Test 

Coding in Python, using tools such as scikit-learn and dashboarding tools like Tableau; additionally, I am skilled in ML.

Interview Preparation Tips

Interview preparation tips for other job seekers - Practice your skills and keep improving for a better tomorrow.
Interview experience: 3/5 (Average)
Difficulty level: -
Process duration: More than 8 weeks
Result: Selected

I applied via Naukri.com and was interviewed before Feb 2023. There were 2 interview rounds.

Round 1 - Technical 

(1 Question)

  • Q1. There was only one technical round; the questions were on SQL (a number of join questions) and tool-related topics.
Round 2 - HR 

(1 Question)

  • Q1. CTC discussion with HR; the offer letter was released after all documents were submitted.
Interview experience: 3/5 (Average)
Difficulty level: -
Process duration: -
Result: -
Round 1 - Resume Shortlist 
Round 2 - Coding Test 

Two questions on the basics of data structures and algorithms, of easy and medium difficulty.

Round 3 - Technical 

(2 Questions)

  • Q1. Basic concepts of OOP; data types in Python and C.
  • Q2. Explain your personal project briefly

Interview Preparation Tips

Interview preparation tips for other job seekers - DS and algo basics, along with your personal project, are enough.
Interview experience: 5/5 (Excellent)
Difficulty level: Moderate
Process duration: Less than 2 weeks
Result: Selected

I applied via LinkedIn and was interviewed before Mar 2023. There were 3 interview rounds.

Round 1 - Aptitude Test 

That was great and easy

Round 2 - Coding Test 

Two coding problems were given; the difficulty level was medium.

Round 3 - Technical 

(1 Question)

  • Q1. Asked about my project.

Interview questions from similar companies

Interview Questionnaire 

4 Questions

  • Q1. Technical round
  • Q2. Package and other formal discussion
  • Q3. They test your stress level
  • Q4. How would you handle a P1 situation?
  • Ans. 

    Handle a P1 situation by prioritizing the issue, communicating effectively, and collaborating with team members.

    • Prioritize the issue based on impact and urgency

    • Communicate with stakeholders about the situation and potential solutions

    • Collaborate with team members to address the issue efficiently

  • Answered by AI

Interview Preparation Tips

Round: Resume Shortlist
Experience: They check your resume against the job requirements.

Round: Test
Experience: Com test

Interview Questionnaire 

2 Questions

  • Q1. OOPS Concepts, System engineering
  • Q2. Would you work all rotational shifts without a shift allowance?
  • Ans. 

    It would be challenging to work all rotational shifts without shift allowance.

    • Working all rotational shifts without shift allowance can lead to burnout and decreased job satisfaction.

    • It may be difficult to maintain work-life balance without shift allowance.

    • Financial compensation for working rotational shifts is a common practice in many industries.

    • Without shift allowance, employees may feel undervalued and demotivated.

    • ...

  • Answered by AI

Interview Preparation Tips

Round: Test
Experience: Aptitude


Interview Questionnaire 

3 Questions

  • Q1. Why Capgemini?
  • Ans. 

    Capgemini offers a diverse range of projects and opportunities for growth, with a strong focus on innovation and collaboration.

    • Capgemini has a global presence, providing opportunities to work on projects with clients from various industries and regions.

    • The company values innovation and encourages employees to think creatively and implement new ideas.

    • Capgemini promotes a collaborative work environment, where teamwork an...

  • Answered by AI
  • Q2. Why should we select you?
  • Ans. 

    I have a strong analytical background, proven track record of delivering results, and excellent communication skills.

    • I have a Master's degree in Business Analytics and 5+ years of experience in data analysis.

    • I have consistently exceeded performance targets in my previous roles by utilizing advanced analytical techniques.

    • I have excellent communication skills, which allow me to effectively present complex data insights t...

  • Answered by AI
  • Q3. Why engineering?
  • Ans. 

    Engineering allows me to apply problem-solving skills to create innovative solutions and make a positive impact on society.

    • Passion for problem-solving and innovation

    • Desire to make a positive impact on society through technology

    • Interest in applying scientific principles to real-world challenges

  • Answered by AI

Interview Preparation Tips

Round: Test
Duration: 1 hour 30 minutes

Round: Technical Interview
Experience: Questions about projects done, no coding questions (depends on stream).
More of a conversation than question answer session.

Skills: Aptitude, Communication And Confidence
College Name: NIT Raipur

I applied via Campus Placement and was interviewed before Nov 2018. There were 4 interview rounds.

Interview Questionnaire 

4 Questions

  • Q1. Questions on OOP: inheritance and its applications to real life.
  • Q2. Sorting and complexity-based questions; learn a bit of Big O.
  • Q3. Some SQL and relational database commands/properties (ACID); the ER diagram is important.
  • Q4. Goals and targets in a career in IT.
  • Ans. 

    Goals in IT include career advancement, skill development, and contributing to innovative projects.

    • Career advancement through promotions or moving to higher positions

    • Skill development through training, certifications, and learning new technologies

    • Contributing to innovative projects that make a difference in the industry

    • Setting personal goals and targets to measure progress and success

    • Networking and building relationshi...

  • Answered by AI

Interview Preparation Tips

Interview preparation tips for other job seekers - It's pretty simple: prepare for aptitude and basic coding concepts, be confident in DB concepts and sorting/searching algorithms, and be prepared to discuss your final-year and internship projects.

I applied via Company Website and was interviewed before Sep 2019. There were 5 interview rounds.

Interview Questionnaire 

2 Questions

  • Q1. What is Active Directory? What is the difference between MDM and MAM? What do you know about Microsoft Intune? How does Azure Active Directory work? What is Exchange ActiveSync?
  • Q2. It's basically the test of your domain knowledge. So be confident of what you say and always try to be logical. Try to use examples, real-life scenarios to explain your points.

Interview Preparation Tips

Interview preparation tips for other job seekers - Prepare well. Follow the basics. Try not to memorize; focus on the concepts instead. Keep exploring and learning. Create a test account and practice what you study.
Key point: never lose faith in yourself if you are not selected. Failure is not the end of the road; it is a wake-up call that what you know is not yet sufficient and that it is time to upgrade.
Have patience if a company doesn't hire you. Someone better than the last one will hire the better, much-upgraded version of you.
Good Luck :)

Skills evaluated in this interview

Accenture Interview FAQs

How many rounds are there in Accenture Data Engineering Analyst interview for experienced candidates?
Accenture interview process for experienced candidates usually has 2-3 rounds. The most common rounds in the Accenture interview process for experienced candidates are Technical, Coding Test and Resume Shortlist.
What are the top questions asked in Accenture Data Engineering Analyst interview for experienced candidates?

Some of the top questions asked at the Accenture Data Engineering Analyst interview for experienced candidates -

  1. Given an Employee table with columns Employee name, Salary, and Department, write a PySpark query to find the name of the employee with the second highest salary in each department.
  2. You have 200 petabytes of data to load. How will you decide the number of executors required?
  3. Suppose a file has 100 columns and you want to load only 10 of them. How would you approach this?


Overall Interview Experience Rating: 3.2/5, based on 5 interview experiences

Difficulty level: Moderate 100%
Duration: Less than 2 weeks 67%, More than 8 weeks 33%

Accenture Data Engineering Analyst Salary: ₹5 L/yr - ₹11 L/yr, based on 2.9k salaries (9% less than the average Data Engineering Analyst salary in India)

Accenture Data Engineering Analyst Reviews and Ratings: 3.8/5, based on 226 reviews

Rating in categories: Skill development 3.9, Work-life balance 3.6, Salary 3.1, Job security 3.8, Company culture 3.8, Promotions 2.7, Work satisfaction 3.5
Top Accenture designations by reported salaries:

  • Application Development Analyst: ₹4.8 L/yr - ₹11 L/yr (39.3k salaries)
  • Application Development - Senior Analyst: ₹8.3 L/yr - ₹16.1 L/yr (27.7k salaries)
  • Team Lead: ₹12.6 L/yr - ₹22.5 L/yr (26.5k salaries)
  • Senior Analyst: ₹9 L/yr - ₹15.7 L/yr (19.5k salaries)
  • Senior Software Engineer: ₹10.4 L/yr - ₹18 L/yr (18.5k salaries)