Capgemini
To recommend customers to migrate to the cloud, assess their current infrastructure, plan the migration strategy, choose the right cloud provider, and ensure data security.
Assess the customer's current infrastructure and identify the applications and data that can be migrated to the cloud.
Plan the migration strategy by considering factors like cost, time, and resource requirements.
Choose the right cloud provider based ...
I applied via campus placement at RC Patel College of Education, Shirpur and was interviewed in Oct 2024. There were 3 interview rounds.
There are some general aptitude questions.
There were two simple coding problems, and we needed to pass the test cases for at least one of them.
The most difficult subject in college was Advanced Calculus.
Advanced Calculus involved complex mathematical concepts and required a deep understanding of calculus principles.
The subject required a lot of practice and problem-solving skills to master the concepts.
Topics such as multivariable calculus, differential equations, and vector calculus were particularly challenging.
The abstract nature of the subject made it dif...
I am a recent graduate with a degree in Computer Science and a passion for data engineering.
Graduated with a degree in Computer Science
Strong interest in data engineering
Completed internships in data analysis and database management
Deleting duplicate rows in SQL
Use the DISTINCT keyword in SELECT statement to retrieve unique rows
Use GROUP BY clause to group rows with same values and then use aggregate functions to select one row
Use the ROW_NUMBER() function, partitioned by the columns that define a duplicate, to number the rows in each group, and then delete the rows whose row number is greater than 1
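The GROUP BY approach above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the interviewer's expected answer: the employees table, its columns, and the sample rows are hypothetical, and the rule "keep the lowest id in each group" is an assumption about which copy should survive.

```python
import sqlite3

# In-memory database with a hypothetical table containing duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Asha", "Data"), ("Asha", "Data"), ("Ravi", "Ops"), ("Asha", "Data")],
)

# Keep the lowest id in each (name, dept) group and delete every other copy.
conn.execute(
    """
    DELETE FROM employees
    WHERE id NOT IN (
        SELECT MIN(id) FROM employees GROUP BY name, dept
    )
    """
)

remaining = conn.execute(
    "SELECT id, name, dept FROM employees ORDER BY id"
).fetchall()
print(remaining)
```

In databases that allow deleting through a CTE (SQL Server, for example), the ROW_NUMBER() variant follows the same idea: number the rows within each duplicate group and delete those numbered above 1.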
To remove header and trailer from a sequential data file in Datastage.
Use Sequential File stage in Datastage.
Set the 'Skip Rows' property to the number of header rows to be skipped.
Set the 'Trailer Rows' property to the number of trailer rows to be skipped.
Use a Transformer stage to remove any remaining header or trailer rows.
Use the 'Remove' function in the Transformer stage to remove the rows.
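Outside DataStage, the same idea (drop a fixed number of leading and trailing rows from a sequential file) can be sketched in plain Python. This is only an analogue for illustration and does not use any DataStage API; the function name, parameters, and sample records are hypothetical.

```python
def strip_header_trailer(lines, header_rows=1, trailer_rows=1):
    """Return the data rows, dropping the first and last N rows of the file."""
    end = len(lines) - trailer_rows
    # If the file holds nothing but header and trailer rows, no data remains.
    return lines[header_rows:end] if end > header_rows else []


# Hypothetical file content: one header row, two data rows, one trailer row.
records = ["HDR|20241101", "1|Asha", "2|Ravi", "TRL|2"]
data_rows = strip_header_trailer(records)
print(data_rows)
```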
To kill a job in Datastage
Stop the job manually from the Director client
Terminate the job from the command line using the dsjob command
Kill the job process from the operating system level
Delete the job from the Datastage repository
To find a process ID in Linux, use the command 'ps -aux | grep <process_name>'.
Open the terminal
Type 'ps -aux' to list all running processes
Use 'grep <process_name>' to filter the list for the process you want
The process ID (PID) will be listed in the second column
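The steps above rely on the PID being the second column of the ps output. A small Python sketch of that parsing is shown here; the find_pids helper and the SAMPLE text are hypothetical and exist only so the example runs without a live system.

```python
from typing import List


def find_pids(ps_output: str, name: str) -> List[int]:
    """Return PIDs from `ps aux`-style text whose command contains `name`."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        # Ignore the grep process itself, which also matches in a shell pipeline.
        if name in command and not command.startswith("grep "):
            pids.append(int(fields[1]))  # PID is the second column
    return pids


# Hypothetical output, shaped like what `ps aux` prints.
SAMPLE = """\
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.1 168000 11000 ?        Ss   09:00   0:01 /sbin/init
etl         4321  1.2  2.0 900000 80000 ?        Sl   09:05   0:42 python pipeline.py
etl         4400  0.0  0.0   6000   800 pts/0    S+   09:30   0:00 grep python
"""

pids = find_pids(SAMPLE, "python")
print(pids)
```

On a real machine the input text could come from subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout, and the pgrep command gives the PIDs directly.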
I applied via Recruitment Consultant and was interviewed in Feb 2021. There were 4 interview rounds.
I applied via Recruitment Consultant
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I am a Senior Data Engineer with experience in building scalable data pipelines and optimizing data processing workflows.
Experience in designing and implementing ETL processes using tools like Apache Spark and Airflow
Proficient in working with large datasets and optimizing query performance
Strong background in data modeling and database design
Worked on projects involving real-time data processing and streaming analytics
Decorators in Python are functions that modify the behavior of other functions or methods.
Decorators are applied using the @decorator_name syntax on the line before a function definition.
They can be used to add functionality to existing functions without modifying their code.
Decorators can be used for logging, timing, authentication, and more.
Example: @staticmethod decorator in Python is used to define a static method in a class.
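As a concrete illustration of the timing use case mentioned above, here is a minimal sketch of a user-defined decorator. The names timed, last_elapsed, and add are hypothetical and chosen only for this example.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call took."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None  # no call has been timed yet
    return wrapper


@timed
def add(a, b):
    return a + b


total = add(2, 3)
print(total, add.last_elapsed)
```

Writing @timed above the definition is equivalent to add = timed(add), which is why the original function body does not need to change.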
SQL query to group by employee ID and combine first name and last name with a space
Use the GROUP BY clause to group by employee ID
Use the CONCAT function to combine first name and last name with a space
Select employee ID, CONCAT(first_name, ' ', last_name) AS full_name
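The query described above can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the employees table and its rows are hypothetical, and the || operator stands in for CONCAT because not every SQLite version provides a CONCAT function. The name columns are added to the GROUP BY so the query is also valid in stricter databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Asha", "Patil"), (2, "Ravi", "Kumar"), (1, "Asha", "Patil")],
)

# Group by employee ID and build the full name with a space in between.
rows = conn.execute(
    """
    SELECT employee_id, first_name || ' ' || last_name AS full_name
    FROM employees
    GROUP BY employee_id, first_name, last_name
    ORDER BY employee_id
    """
).fetchall()
print(rows)
```

In MySQL or SQL Server the expression would be written CONCAT(first_name, ' ', last_name), as in the answer above.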
Constructors in Python are special methods used for initializing objects. They are called automatically when a new instance of a class is created.
Constructors are defined using the __init__() method in a class.
They are used to initialize instance variables of a class.
Example:
    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age

    person1 = Person('Alice', 30)
Indexing in SQL is a technique used to improve the performance of queries by creating a data structure that allows for faster retrieval of data.
Indexes are created on columns in a database table to speed up the retrieval of rows that match a certain condition in a WHERE clause.
Indexes can be created using CREATE INDEX statement in SQL.
Types of indexes include clustered indexes, non-clustered indexes, unique indexes, an...
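The effect of CREATE INDEX on a WHERE lookup can be observed with a small sqlite3 sketch. The orders table, the idx_orders_customer index name, and the generated data are hypothetical, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


before = plan(QUERY)  # full table scan: no index covers customer_id yet
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY)   # the planner can now search through the new index

print(before)
print(after)
```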
Spark works well with Parquet files due to its columnar storage format, efficient compression, and ability to push down filters.
Parquet files are columnar storage format, which aligns well with Spark's processing model of working on columns rather than rows.
Parquet files support efficient compression, reducing storage space and improving read performance in Spark.
Spark can push down filters to Parquet files, allowing f...
I applied via Recruitment Consultant and was interviewed in Nov 2024. There were 2 interview rounds.
Different types of joins available in Databricks include inner join, outer join, left join, right join, and cross join.
Inner join: Returns only the rows that have matching values in both tables.
Outer join (full outer join): Returns all rows from both tables, with NULLs where a row has no match in the other table.
Left join: Returns all rows from the left table and the matched rows from the right table.
Right join: Returns all rows from the right table and the matched rows ...
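The join semantics listed above are standard SQL, so they can be illustrated outside Databricks with Python's built-in sqlite3 module. The tables and rows below are hypothetical. Only inner, left, and cross joins are shown, because right and full outer joins need a recent SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (id INTEGER, name TEXT, dept_id INTEGER);
    CREATE TABLE departments (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employees VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meera', NULL);
    INSERT INTO departments VALUES (10, 'Data'), (30, 'Finance');
    """
)

# Inner join: only employees whose dept_id matches a department.
inner = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.id
    """
).fetchall()

# Left join: every employee, with NULL (None) where no department matches.
left = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.id
    """
).fetchall()

# Cross join: every employee paired with every department.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN departments"
).fetchone()[0]

print(inner, left, cross_count)
```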
Implementing fault tolerance in a data pipeline involves redundancy, monitoring, and error handling.
Use redundant components to ensure continuous data flow
Implement monitoring tools to detect failures and bottlenecks
Set up automated alerts for immediate response to issues
Design error handling mechanisms to gracefully handle failures
Use checkpoints and retries to ensure data integrity
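The checkpoint-and-retry idea from the list above can be sketched in a few lines of Python. All names here (run_with_retries, process_batches, flaky_handler) are hypothetical, the checkpoint is an in-memory set rather than durable storage, and a production pipeline would also want idempotent handlers.

```python
import time


def run_with_retries(step, max_attempts=3, base_delay=0.01):
    """Call `step`, retrying with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** (attempt - 1))


def process_batches(batches, handler, checkpoint):
    """Process batches in order, skipping any already recorded in `checkpoint`."""
    for index, batch in enumerate(batches):
        if index in checkpoint:
            continue  # finished in an earlier run, so do not redo it
        run_with_retries(lambda: handler(batch))
        checkpoint.add(index)  # record success only after the handler returns


results = []
calls = {"count": 0}


def flaky_handler(batch):
    """Hypothetical handler that fails once to simulate a transient error."""
    calls["count"] += 1
    if calls["count"] == 2:
        raise RuntimeError("transient failure")
    results.append(sum(batch))


checkpoint = set()
process_batches([[1, 2], [3, 4], [5]], flaky_handler, checkpoint)
print(results, checkpoint)
```

Rerunning process_batches with the same checkpoint set does no further work, which is the behavior a restart after a crash would rely on.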
Auto Loader is a Databricks feature that incrementally ingests new data files as they arrive in cloud storage, without reprocessing files it has already loaded.
Automates the detection and loading of newly arrived files
Reduces manual effort and human error
Can run continuously as a stream or be triggered on a schedule
Comparable ingestion tools outside Databricks: Apache NiFi, AWS Glue
To connect to different services in Azure, you can use Azure SDKs, REST APIs, Azure Portal, Azure CLI, and Azure PowerShell.
Use Azure SDKs for programming languages like Python, Java, C#, etc.
Utilize REST APIs to interact with Azure services programmatically.
Access and manage services through the Azure Portal.
Leverage Azure CLI for command-line interface interactions.
Automate tasks using Azure PowerShell scripts.
Linked Services are connections to external data sources or destinations in Azure Data Factory.
Linked Services define the connection information needed to connect to external data sources or destinations.
They can be used in Data Factory pipelines to read from or write to external systems.
Examples of Linked Services include Azure Blob Storage, Azure SQL Database, and Amazon S3.
I applied via Recruitment Consultant and was interviewed in Nov 2024. There were 2 interview rounds.
I have a background in data analysis with experience in using tools like Python, SQL, and Tableau.
I have a degree in Statistics and have worked as a Data Analyst for 3 years.
My daily activities include cleaning and analyzing data, creating visualizations, and presenting insights to stakeholders.
I use Python for data manipulation and analysis, SQL for querying databases, and Tableau for creating interactive dashboards.
I...
Advanced Excel and Power BI are tools used for data analysis and visualization in companies and for clients.
Advanced Excel allows for complex data manipulation, analysis, and visualization using features like pivot tables, macros, and VBA programming.
Power BI is a business analytics tool that provides interactive visualizations and business intelligence capabilities, connecting to various data sources.
These tools are u...
I have extensive experience in using Advanced Excel and Power BI for data analysis projects.
Created complex formulas and macros in Excel to automate data processing tasks
Designed interactive dashboards in Power BI to visualize and analyze data trends
Integrated data from multiple sources into Power BI for comprehensive analysis
Used Power Query and Power Pivot in Excel to manipulate and analyze large datasets
Provided dat...
Credit and operations concepts in relation to KYC procedures and client data privacy.
Credit refers to the extension of money or resources to a client based on their financial history and ability to repay.
Operations involve the day-to-day processes and procedures within a financial institution to ensure smooth functioning.
KYC procedures are used to verify the identity of clients to prevent fraud and money laundering.
Pri...
I applied via LinkedIn and was interviewed in Dec 2024. There were 3 interview rounds.
Related to Statistics
Related to Excel and SQL
I applied via Company Website
Prepare verbal, LRDI, and quant at a moderate level; the R.S. Aggarwal book is a good resource for aptitude preparation.
Practice at least 100 pseudocode questions, and use the CodeChef platform for coding practice.
Consultant: 55.3k salaries | ₹5.2 L/yr - ₹18 L/yr
Associate Consultant: 52k salaries | ₹2.9 L/yr - ₹11.8 L/yr
Senior Consultant: 46k salaries | ₹7.4 L/yr - ₹24 L/yr
Senior Analyst: 20.5k salaries | ₹2 L/yr - ₹7.5 L/yr
Senior Software Engineer: 19.9k salaries | ₹3.5 L/yr - ₹12.5 L/yr