Accenture Data Engineer Interview Questions
PySpark is the Python API for Apache Spark, a distributed computing framework for processing big data.
PySpark architecture includes a driver program, cluster manager, and worker nodes.
The driver program is responsible for converting the user code into tasks and scheduling them on the worker nodes.
Cluster manager allocates resources and monitors the worker nodes.
Worker nodes execute the tasks and return the results to the driver.
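A minimal sketch of that flow (the app name and data are illustrative): the script below acts as the driver program, and the count() action is what triggers tasks on the worker nodes.

    from pyspark.sql import SparkSession

    # The script that builds the SparkSession is the driver program.
    spark = SparkSession.builder.appName("architecture-demo").getOrCreate()

    # Transformations are only recorded by the driver...
    rdd = spark.sparkContext.parallelize(range(1000), numSlices=4)
    doubled = rdd.map(lambda x: x * 2)

    # ...until an action runs: the driver splits the job into one task per
    # partition, schedules them on worker nodes, and collects the results.
    print(doubled.count())  # 1000
    spark.stop()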
I applied via Naukri.com and was interviewed in Aug 2024. There was 1 interview round.
Spark architecture refers to the structure of Apache Spark, including components like driver, executor, and cluster manager.
Spark architecture consists of a driver program that manages the execution of tasks.
Executors are worker nodes that run tasks and store data in memory or disk.
Cluster manager allocates resources and coordinates tasks across the cluster.
Spark applications run on a cluster of machines managed by a cluster manager such as YARN, Kubernetes, or Spark's standalone manager.
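As a hedged sketch, the cluster manager and executor resources are typically chosen when the session is built; the master URL and resource values below are placeholders, not recommendations.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("cluster-demo")
        .master("yarn")                           # cluster manager (could be k8s://..., spark://..., or local)
        .config("spark.executor.instances", "2")  # number of executors
        .config("spark.executor.memory", "4g")    # memory per executor
        .config("spark.executor.cores", "2")      # cores per executor
        .getOrCreate()
    )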
A wide transformation in PySpark shuffles data across partitions and is typically required for operations like groupBy.
Wide transformations involve shuffling data across partitions
They are typically used for operations like groupBy, join, and sortByKey
They require data movement and can be more expensive in terms of performance compared to narrow transformations
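A small sketch of the contrast (column names and data are illustrative): the groupBy below forces a shuffle, which appears as an Exchange node in the physical plan.

    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F

    spark = SparkSession.builder.appName("wide-demo").getOrCreate()
    df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])

    # Narrow: each output partition depends on one input partition, no shuffle.
    narrow = df.filter(F.col("value") > 1)

    # Wide: rows with the same key must be moved to the same partition.
    wide = df.groupBy("key").agg(F.sum("value").alias("total"))
    wide.explain()  # the physical plan contains an Exchange (shuffle) step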
I was approached by the company and interviewed in Aug 2024. There was 1 interview round.
select is used to select specific columns from a DataFrame, while withColumn is used to add or update columns in a DataFrame.
select is used to select specific columns from a DataFrame
withColumn is used to add or update columns in a DataFrame
Neither modifies the original DataFrame; both return a new DataFrame (select with only the chosen columns, withColumn with all columns plus the new/updated one)
Example: df.select('col1', 'col2') - selects columns col1 and col2
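A short sketch contrasting the two (the DataFrame and column names are illustrative):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("select-demo").getOrCreate()
    df = spark.createDataFrame([(1, "x"), (2, "y")], ["col1", "col2"])

    # select returns a new DataFrame containing only the listed columns.
    subset = df.select("col1", "col2")

    # withColumn returns a new DataFrame with all existing columns plus the
    # added (or replaced) one; df itself is unchanged in both cases.
    enriched = df.withColumn("col3", F.col("col1") * 2)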
Variables are used to store values that can be changed, while parameters are used to pass values into activities in ADF.
Variables can be modified within a pipeline, while parameters are set at runtime and cannot be changed within the pipeline.
Variables are declared and updated inside the pipeline, while parameters are declared on the pipeline and supplied by the trigger or caller.
Variables can be used to store intermediate values or results, while parameters are used to pass values in at runtime.
HackerRank test, 1 hour duration, Python coding and SQL
I am a data engineer with a strong background in programming and data analysis.
Experienced in programming languages such as Python, SQL, and Java
Skilled in data manipulation, ETL processes, and data visualization tools
Worked on projects involving big data processing and machine learning algorithms
I want to join Accenture because of their reputation for innovation and their focus on professional development.
Accenture is known for its cutting-edge technology solutions and innovative approach to problem-solving
I am impressed by Accenture's commitment to continuous learning and career growth opportunities
I believe that joining Accenture will provide me with the chance to work on challenging projects and collaborate with talented teams.
Slowly Changing Dimension (SCD) in Informatica is used to track historical data changes in a data warehouse.
SCD Type 1: Overwrite old data with new data
SCD Type 2: Add new row for each change with effective start and end dates
SCD Type 3: Add columns to track changes without adding new rows
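Informatica implements SCD through its own mappings and transformations; as a language-neutral illustration, here is a minimal PySpark sketch of the Type 2 idea, with hypothetical table and column names and a '9999-12-31' sentinel marking the currently valid row.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2-demo").getOrCreate()

    # Current dimension rows in SCD Type 2 layout (names/values hypothetical).
    dim = spark.createDataFrame(
        [(1, "Pune", "2023-01-01", "9999-12-31")],
        ["cust_id", "city", "start_date", "end_date"],
    )
    updates = spark.createDataFrame([(1, "Mumbai")], ["cust_id", "city"])

    today = F.date_format(F.current_date(), "yyyy-MM-dd")

    # Type 2: close the changed row by ending its validity today...
    closed = dim.join(updates.select("cust_id"), "cust_id").withColumn("end_date", today)

    # ...and add a new row that starts today and is open-ended.
    new_rows = updates.select(
        "cust_id", "city",
        today.alias("start_date"),
        F.lit("9999-12-31").alias("end_date"),
    )

    # Unchanged dimension rows would be carried over as-is (omitted for brevity).
    scd2 = closed.unionByName(new_rows)
    scd2.show()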
Azure tech stack used in the current project includes Azure Data Factory, Azure Databricks, and Azure SQL Database.
Azure Data Factory for data integration and orchestration
Azure Databricks for big data processing and analytics
Azure SQL Database for storing and querying structured data
Mount points are directories in a Unix-like operating system where additional file systems can be attached.
Use the 'mount' command to attach a file system to a directory
Specify the device or file system to be mounted and the directory where it should be attached
Use the 'umount' command to detach a file system from a directory
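In a Databricks environment (common alongside the Azure stack above), the analogous operation is dbutils.fs.mount. A sketch that only runs inside a Databricks notebook, with a hypothetical storage account, container, and secret scope:

    # Attach an Azure storage container at /mnt/raw.
    # "storageacct", "raw", and the secret scope/key names are hypothetical.
    dbutils.fs.mount(
        source="wasbs://raw@storageacct.blob.core.windows.net",
        mount_point="/mnt/raw",
        extra_configs={
            "fs.azure.account.key.storageacct.blob.core.windows.net":
                dbutils.secrets.get(scope="demo-scope", key="storage-key")
        },
    )

    # Files are then readable through the mount point:
    df = spark.read.csv("/mnt/raw/sales.csv", header=True)

    # Detach when no longer needed:
    dbutils.fs.unmount("/mnt/raw")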
I applied via Naukri.com and was interviewed in Jun 2024. There was 1 interview round.
An accumulator is a variable used in distributed computing to aggregate values across multiple tasks or nodes.
Accumulators are used in Spark to perform calculations in a distributed manner.
From a task's point of view they are write-only: tasks can only add to them via an associative and commutative operation, and only the driver can read the value.
Accumulators are used for tasks like counting elements or summing values in parallel processing.
Example: counting the number of error records encountered while processing a dataset in parallel
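A minimal sketch of that example (the log lines are illustrative): tasks running on the workers add to the accumulator, and only the driver reads the final value.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("acc-demo").getOrCreate()
    sc = spark.sparkContext

    error_count = sc.accumulator(0)
    lines = sc.parallelize(["ok", "ERROR: disk", "ok", "ERROR: net"])

    def check(line):
        # Tasks may only add to the accumulator; they cannot read it.
        if line.startswith("ERROR"):
            error_count.add(1)

    lines.foreach(check)       # runs on the workers
    print(error_count.value)   # 2 -- readable only on the driver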
persist() lets you choose a storage level (memory, disk, or both), while cache() uses the default level (memory-only for RDDs).
With StorageLevel.MEMORY_AND_DISK, persist() spills partitions that do not fit in memory to disk, so they can be reloaded instead of recomputed.
cache() keeps data in memory only, which is fastest but evicts partitions under memory pressure.
Use persist() when the dataset may not fit in memory or a specific level is needed; use cache() for simple in-memory reuse.
Example: rdd.persist(StorageLevel.MEMORY_AND_DISK) can spill to disk, while rdd.cache() stores data in memory for faster access.
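A short sketch of the difference, using two RDDs because an RDD's storage level cannot be changed once set:

    from pyspark import StorageLevel
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("persist-demo").getOrCreate()

    # cache() is shorthand for the default storage level (MEMORY_ONLY for RDDs).
    rdd = spark.sparkContext.parallelize(range(1_000_000))
    rdd.cache()

    # persist() lets you pick a level; MEMORY_AND_DISK spills partitions that
    # do not fit in memory to disk instead of recomputing them from lineage.
    rdd2 = spark.sparkContext.parallelize(range(1_000_000))
    rdd2.persist(StorageLevel.MEMORY_AND_DISK)

    rdd.count()   # actions materialize the cached/persisted data
    rdd2.count()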
The Accenture Data Engineer interview process typically takes less than 2 weeks to complete, though the duration can vary.