Capgemini Data Engineer Interview Questions
I was interviewed in Mar 2024.
Set Spark configuration with appropriate memory and cores for efficient processing of 2 GB of data
Increase executor memory and cores to handle larger data sizes
Adjust Spark memory overhead to prevent out-of-memory errors
Optimize shuffle partitions for better performance
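A minimal PySpark sketch of the configuration points above, with illustrative values for a ~2 GB input; every figure is an assumption and should be tuned to the actual cluster (executor settings are normally fixed at application or cluster launch):

from pyspark.sql import SparkSession

# Illustrative values only; adjust to the environment.
spark = (
    SparkSession.builder
    .appName("process_2gb_dataset")
    .config("spark.executor.memory", "4g")            # executor heap
    .config("spark.executor.cores", "2")              # cores per executor
    .config("spark.executor.memoryOverhead", "512m")  # off-heap headroom to avoid OOM
    .config("spark.sql.shuffle.partitions", "32")     # fewer partitions suit a small dataset
    .getOrCreate()
)

# Hypothetical input/output paths.
df = spark.read.parquet("/data/input")
df.groupBy("key").count().write.mode("overwrite").parquet("/data/output")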
Use dbutils.notebook.run() command to run a child notebook in a parent notebook
Use dbutils.notebook.run() command with the path to the child notebook and any parameters needed
Ensure that the child notebook is accessible and has necessary permissions
Handle any return values or errors from the child notebook appropriately
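A minimal sketch of the parent-notebook call in Databricks; the child notebook path and parameter names are hypothetical:

# dbutils is available inside a Databricks notebook.
try:
    # run(path, timeout_seconds, arguments) returns whatever the child passes
    # to dbutils.notebook.exit().
    result = dbutils.notebook.run(
        "/Workspace/etl/child_notebook",   # hypothetical path
        60,                                # timeout in seconds
        {"run_date": "2024-03-01"},        # hypothetical parameters (widgets in the child)
    )
    print(f"Child notebook returned: {result}")
except Exception as err:
    print(f"Child notebook failed: {err}")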
I applied via Referral and was interviewed in Jun 2024. There was 1 interview round.
Databricks and Azure Synapse Notebook are both cloud-based platforms for data engineering and analytics.
Databricks is primarily focused on big data processing and machine learning, while Azure Synapse Notebook is part of a larger analytics platform.
Databricks provides a collaborative environment for data scientists and engineers to work together, while Azure Synapse Notebook is integrated with other Azure services for end-to-end analytics workflows.
I applied via Naukri.com and was interviewed in Feb 2024. There were 2 interview rounds.
BQ stands for BigQuery, a fully managed, serverless, and highly scalable cloud data warehouse provided by Google Cloud.
Advantages of BigQuery include fast query performance due to its distributed architecture
Scalability to handle large datasets without the need for infrastructure management
Integration with other Google Cloud services like Dataflow, Dataproc, and Data Studio
Support for standard SQL queries and real-time analytics on streaming data
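A minimal sketch of running a standard SQL query from Python with the google-cloud-bigquery client library; the project, dataset, and table names are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()  # uses application default credentials

query = """
    SELECT name, COUNT(*) AS cnt
    FROM `my_project.my_dataset.events`   -- hypothetical table
    GROUP BY name
    ORDER BY cnt DESC
    LIMIT 10
"""

for row in client.query(query).result():
    print(row.name, row.cnt)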
I applied via Referral and was interviewed in Mar 2024. There were 2 interview rounds.
Snowflake architecture is a cloud-based data warehousing solution that separates storage and compute resources for scalability and performance.
Snowflake uses a unique architecture with three layers: storage, compute, and services.
Storage layer stores data in a columnar format for efficient querying.
Compute layer processes queries independently, allowing for elastic scalability.
Services layer manages metadata, security, authentication, and query optimization.
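A minimal sketch of how the compute/storage separation shows up in practice: a virtual warehouse (compute) can be created and resized independently of the tables it queries. This assumes the snowflake-connector-python package; the account, credentials, and object names are hypothetical:

import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical credentials
    user="my_user",
    password="***",
)
cur = conn.cursor()

# Compute layer: create or resize a warehouse without touching stored data.
cur.execute("CREATE WAREHOUSE IF NOT EXISTS etl_wh WAREHOUSE_SIZE = 'XSMALL'")
cur.execute("ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'MEDIUM'")

# Storage layer: the same tables are visible from any warehouse.
cur.execute("USE WAREHOUSE etl_wh")
cur.execute("SELECT COUNT(*) FROM my_db.my_schema.orders")  # hypothetical table
print(cur.fetchone()[0])

cur.close()
conn.close()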
SQL joins are used to combine rows from two or more tables based on a related column between them.
Use INNER JOIN to return rows when there is at least one match in both tables
Use LEFT JOIN to return all rows from the left table, and the matched rows from the right table
Use RIGHT JOIN to return all rows from the right table, and the matched rows from the left table
Use FULL JOIN to return rows when there is a match in one of the tables
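A minimal PySpark sketch of the four join types on hypothetical sample data; the same semantics apply in plain SQL:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join_examples").getOrCreate()

emp = spark.createDataFrame(
    [(1, "Asha", 10), (2, "Ravi", 20), (3, "Meera", 99)],
    ["emp_id", "name", "dept_id"],
)
dept = spark.createDataFrame(
    [(10, "Finance"), (20, "HR"), (30, "IT")],
    ["dept_id", "dept_name"],
)

emp.join(dept, "dept_id", "inner").show()  # only matching dept_ids (10, 20)
emp.join(dept, "dept_id", "left").show()   # all employees; NULL dept_name for dept_id 99
emp.join(dept, "dept_id", "right").show()  # all departments; NULL name for dept_id 30
emp.join(dept, "dept_id", "full").show()   # union of both sides, NULLs where unmatched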
Time series
Logical reasoning
Family tree
Spark is a distributed computing system that provides an interface for programming clusters with implicit data parallelism.
Spark is built on the concept of Resilient Distributed Datasets (RDDs), which are fault-tolerant collections of objects.
It supports various programming languages such as Scala, Java, Python, and R.
Spark provides high-level APIs for distributed data processing, including transformations and actions.
...
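A minimal PySpark sketch of the RDD model described above, with lazy transformations followed by an action:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd_example").getOrCreate()
sc = spark.sparkContext

# An RDD: a fault-tolerant collection partitioned across the cluster.
numbers = sc.parallelize(range(1, 101), numSlices=4)

# Transformations (lazy): build the lineage of squared even numbers.
squares_of_evens = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

# Action: triggers distributed execution and returns a result to the driver.
print(squares_of_evens.sum())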
Lazy evaluation is a strategy used by Spark to delay the execution of transformations until an action is called.
Lazy evaluation improves performance by optimizing the execution plan
Transformations in Spark are not executed immediately, but rather recorded as a lineage graph
Actions trigger the execution of the transformations and produce a result
Lazy evaluation allows Spark to optimize the execution plan by combining and reordering transformations before execution
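A minimal sketch of lazy evaluation with DataFrames: the transformations below only record lineage, and nothing executes until the action at the end (the column name is illustrative):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("lazy_eval").getOrCreate()

df = spark.range(1_000_000)                          # no job runs yet
doubled = df.withColumn("doubled", F.col("id") * 2)  # still only lineage
filtered = doubled.filter(F.col("doubled") > 10)     # still only lineage

filtered.explain()       # shows the optimized plan Spark built from the lineage
print(filtered.count())  # action: the plan actually executes here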
Left join returns all records from the left table and the matching records from the right table.
Inner join returns only the matching records from both tables.
Left join includes all records from the left table, even if there are no matches in the right table.
Inner join excludes the non-matching records from both tables.
Left join is used to retrieve all records from one table and the matching records from another table.
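A small sketch of the difference using Spark SQL over hypothetical temp views: the left join keeps the customer with no orders, while the inner join drops it:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("left_vs_inner").getOrCreate()

customers = spark.createDataFrame(
    [(1, "Anu"), (2, "Bala"), (3, "Chitra")], ["cust_id", "cust_name"]
)
orders = spark.createDataFrame([(101, 1), (102, 2)], ["order_id", "cust_id"])
customers.createOrReplaceTempView("customers")
orders.createOrReplaceTempView("orders")

# LEFT JOIN: 3 rows -- Chitra appears with a NULL order_id.
spark.sql("""
    SELECT c.cust_name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON c.cust_id = o.cust_id
""").show()

# INNER JOIN: 2 rows -- only customers that have orders.
spark.sql("""
    SELECT c.cust_name, o.order_id
    FROM customers c
    INNER JOIN orders o ON c.cust_id = o.cust_id
""").show()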
Project structure refers to the organization of files, folders, and resources within a project.
Project structure should be logical and easy to navigate
Common structures include separating code into modules, organizing files by type (e.g. scripts, data, documentation), and using version control
Example: A data engineering project may have folders for data extraction, transformation, loading, and documentation
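A hypothetical folder layout along the lines described above, for a small data engineering project:

data-pipeline/
    extraction/        # ingestion scripts and source connectors
    transformation/    # cleaning and business-logic transforms
    loading/           # writers to the warehouse or lake
    tests/             # unit tests for each stage
    docs/              # data dictionary, runbooks
    config/            # environment-specific settings
    README.md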
The duration of the Capgemini Data Engineer interview process can vary, but it typically takes less than 2 weeks to complete (based on 40 interviews). Candidates generally reported 2 interview rounds.
Capgemini salaries by designation (reported salaries | salary range):
Consultant: 55.2k salaries | ₹5.2 L/yr - ₹17.5 L/yr
Associate Consultant: 50.8k salaries | ₹3 L/yr - ₹10 L/yr
Senior Consultant: 46.1k salaries | ₹7.5 L/yr - ₹24.5 L/yr
Senior Analyst: 20.6k salaries | ₹2 L/yr - ₹7.5 L/yr
Senior Software Engineer: 20.2k salaries | ₹3.5 L/yr - ₹12.1 L/yr