I applied via Recruitment Consultant and was interviewed before Nov 2020. There were 2 interview rounds.
Yes, I am a quick learner and have the ability to grasp new techniques while working.
I have a strong foundation in data management principles and techniques
I am always eager to learn and stay up-to-date with the latest industry trends
I have experience working with various data management tools and software
I am able to adapt quickly to new environments and work collaboratively with team members
For example, in my previou...
Yes, I am willing to work extra hours after shifts.
I understand that sometimes extra work is necessary to meet deadlines or complete projects.
I am willing to be flexible with my schedule to ensure that tasks are completed on time.
I have experience working overtime and am comfortable doing so.
I am committed to the success of the team and the company, and am willing to put in extra effort to achieve that success.
I applied via Company Website and was interviewed in Jun 2022. There was 1 interview round.
A data management plan is a document that outlines how data will be collected, organized, stored, and shared.
A data management plan (DMP) is a formal document that describes the processes and procedures for managing data throughout its lifecycle.
It includes details on data collection, storage, organization, documentation, sharing, and preservation.
A DMP ensures that data is managed effectively, securely, and in complia...
Trauma refers to a deeply distressing or disturbing experience that can have long-lasting psychological and emotional effects.
Trauma is an event or experience that overwhelms a person's ability to cope.
It can result from various sources such as accidents, abuse, violence, or natural disasters.
Trauma can lead to symptoms like flashbacks, nightmares, anxiety, depression, and difficulty in functioning.
Examples of trauma i...
I applied via Walk-in and was interviewed in Dec 2024. There were 5 interview rounds.
The task covered statistics: standard deviation, attrition, and the average of values in a given table. An economic graph and a poverty graph were also given, and answers had to be based on them. There were 30 questions with a 60-minute time limit.
I applied via Naukri.com and was interviewed in Nov 2024. There were 2 interview rounds.
The Aptitude Test session assesses mathematical and logical reasoning abilities.
Vlookup is a function in Excel used to search for a value in a table and return a corresponding value from another column.
Vlookup stands for 'Vertical Lookup'
It is commonly used in Excel to search for a value in the leftmost column of a table and return a value in the same row from a specified column
Syntax: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Example: =VLOOKUP(A2, B2:D10, 3, FALSE) - searc...
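The exact-match lookup described above can be mimicked in Python as a rough sketch. The function name `vlookup` and the sample table are made up for illustration; this is not how Excel implements the function.

```python
def vlookup(lookup_value, table, col_index_num):
    """Exact-match lookup, similar in spirit to VLOOKUP(..., FALSE).

    table is a list of rows; the first column of each row is searched.
    col_index_num is 1-based, as in Excel.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return None  # Excel would show #N/A in this case


# Hypothetical sample data: id, name, city
table = [
    ["E01", "Asha", "Pune"],
    ["E02", "Ravi", "Chennai"],
    ["E03", "Meera", "Delhi"],
]

city = vlookup("E02", table, 3)      # value from the third column
missing = vlookup("E99", table, 3)   # no matching row
```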
My day in my previous organization involved analyzing large datasets, creating reports, and presenting findings to stakeholders.
Reviewing and cleaning large datasets to ensure accuracy
Creating visualizations and reports to communicate insights
Collaborating with team members to identify trends and patterns
Presenting findings to stakeholders in meetings or presentations
I possess strong technical skills in data analysis, including proficiency in programming languages, statistical analysis, and data visualization tools.
Proficient in programming languages such as Python, R, SQL
Skilled in statistical analysis and data modeling techniques
Experience with data visualization tools like Tableau, Power BI
Knowledge of machine learning algorithms and techniques
A Pivot Table is a data summarization tool used in spreadsheet programs to analyze, summarize, and present data in a tabular format.
Pivot tables allow users to reorganize and summarize selected columns and rows of data to obtain desired insights.
Users can easily group and filter data, perform calculations, and create visualizations using pivot tables.
Pivot tables are commonly used in Excel and other spreadsheet program...
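As a small illustration of what a pivot table computes, the sketch below groups made-up sales rows by region and product and sums the amounts. The data and variable names are assumptions for the example only.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount)
rows = [
    ("North", "Pens", 100),
    ("North", "Books", 250),
    ("South", "Pens", 80),
    ("North", "Pens", 40),
    ("South", "Books", 300),
]

# Pivot rows are regions, pivot columns are products, values are summed amounts.
pivot = defaultdict(lambda: defaultdict(int))
for region, product, amount in rows:
    pivot[region][product] += amount

north_pens = pivot["North"]["Pens"]
south_total = sum(pivot["South"].values())
```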
To find the highest-paid employee in each department, we need to group employees by department and then select the employee with the highest salary in each group.
Group employees by department
Find the employee with the highest salary in each group
Retrieve the employee's name, salary, and department name
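One possible way to write this, sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions, since the original question does not give a schema. If two employees tie for the top salary in a department, this query returns both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (name TEXT, salary INTEGER, department_id INTEGER);
INSERT INTO departments VALUES (1, 'Finance'), (2, 'Operations');
INSERT INTO employees VALUES
    ('Asha', 50000, 1), ('Ravi', 65000, 1),
    ('Meera', 40000, 2), ('Kiran', 42000, 2);
""")

# Step 1: group by department and find the maximum salary (subquery m).
# Step 2: join back to employees to pick the rows that match that maximum.
# Step 3: join departments to retrieve the department name.
query = """
SELECT e.name, e.salary, d.name
FROM employees e
JOIN departments d ON d.id = e.department_id
JOIN (
    SELECT department_id, MAX(salary) AS max_salary
    FROM employees
    GROUP BY department_id
) m ON m.department_id = e.department_id AND m.max_salary = e.salary
ORDER BY d.name
"""
top_paid = conn.execute(query).fetchall()
conn.close()
```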
I applied via campus placement at All India Institute of Management Studies and was interviewed in Oct 2024. There were 2 interview rounds.
Topic: technology is making us less human (15 min, online)
Fixed assets are long-term tangible assets that are used in the production of goods or services and are not intended for sale.
Fixed assets are physical assets such as buildings, machinery, equipment, vehicles, and land.
They are not intended for sale and are used for the production of goods or services over a long period of time.
Examples of fixed assets include manufacturing plants, office buildings, delivery trucks, an
Capital assets are long-term assets that are used in the production of goods or services and are not easily converted into cash.
Capital assets are typically tangible assets such as buildings, machinery, equipment, and vehicles.
They are used by a company to generate revenue over an extended period of time.
Examples of capital assets include manufacturing plants, delivery trucks, office furniture, and computer systems.
Current ratio is used to assess a company's ability to pay its short-term obligations with its short-term assets.
Current ratio is a liquidity ratio that measures a company's ability to cover its short-term liabilities with its short-term assets.
It is calculated by dividing current assets by current liabilities.
A current ratio of 1 or higher is generally considered healthy, as it indicates that a company has enough curr...
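The calculation can be sketched in a few lines of Python. The function name and the figures are illustrative assumptions, not data from any real company.

```python
def current_ratio(current_assets, current_liabilities):
    """Return current assets divided by current liabilities."""
    if current_liabilities == 0:
        raise ValueError("current liabilities must be non-zero")
    return current_assets / current_liabilities


# Hypothetical balance-sheet figures
ratio = current_ratio(500_000, 250_000)
covers_short_term_debt = ratio >= 1
```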
posted on 17 Jul 2024
I applied via Naukri.com and was interviewed in Aug 2024. There were 2 interview rounds.
I am a Senior Data Engineer with experience in developing data pipelines and optimizing data storage for various projects.
Developed data pipelines using Apache Spark for real-time data processing
Optimized data storage using technologies like Hadoop and AWS S3
Worked on a project to analyze customer behavior and improve marketing strategies
My day-to-day job in the project involved designing and implementing data pipelines, optimizing data workflows, and collaborating with cross-functional teams.
Designing and implementing data pipelines to extract, transform, and load data from various sources
Optimizing data workflows to improve efficiency and performance
Collaborating with cross-functional teams including data scientists, analysts, and business stakeholde...
DAGs handle fault tolerance by rerunning failed tasks and maintaining task dependencies.
DAGs rerun failed tasks automatically to ensure completion.
DAGs maintain task dependencies to ensure proper sequencing.
DAGs can be configured to retry failed tasks a certain number of times before marking them as failed.
Shuffling is the process of redistributing data across partitions in a distributed computing environment.
Shuffling is necessary when data needs to be grouped or aggregated across different partitions.
It can be handled efficiently by minimizing the amount of data being shuffled and optimizing the partitioning strategy.
Techniques like partitioning, combiners, and reducers can help reduce the amount of shuffling in MapRed
Repartition increases or decreases the number of partitions in a DataFrame, while Coalesce only decreases the number of partitions.
Repartition can increase or decrease the number of partitions in a DataFrame, leading to a shuffle of data across the cluster.
Coalesce only decreases the number of partitions in a DataFrame without performing a full shuffle, making it more efficient than repartition.
Repartition is typically...
Incremental data is handled by identifying new data since the last update and merging it with existing data.
Identify new data since last update
Merge new data with existing data
Update data warehouse or database with incremental changes
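The three steps above can be sketched with a timestamp watermark and an in-memory upsert. This is only a minimal illustration: the `updated_at` column, the dictionary used as the target, and the function name are assumptions, and a real pipeline would persist the watermark and write to a warehouse table.

```python
from datetime import datetime


def incremental_load(target, source_rows, last_loaded_at):
    """Upsert rows changed after last_loaded_at and return the new watermark."""
    new_watermark = last_loaded_at
    for row in source_rows:
        if row["updated_at"] > last_loaded_at:   # identify new or changed data
            target[row["id"]] = row               # merge: insert or overwrite by key
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


target = {1: {"id": 1, "amount": 10, "updated_at": datetime(2024, 1, 1)}}
source_rows = [
    {"id": 1, "amount": 15, "updated_at": datetime(2024, 1, 3)},
    {"id": 2, "amount": 20, "updated_at": datetime(2024, 1, 2)},
    {"id": 3, "amount": 30, "updated_at": datetime(2023, 12, 31)},  # older than the watermark
]
watermark = incremental_load(target, source_rows, datetime(2024, 1, 1))
```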
SCD stands for Slowly Changing Dimension, a concept in data warehousing to track changes in data over time.
SCD is used to maintain historical data in a data warehouse.
The three most commonly cited types of SCD are Type 1, Type 2, and Type 3; other variants also exist.
Type 1 SCD overwrites old data with new data.
Type 2 SCD creates a new record for each change, preserving history.
Type 3 SCD maintains both old and new values in the same record.
SCD is important for...
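A Type 2 change can be sketched as follows. The row layout (`valid_from`, `valid_to`, `is_current`) is one common convention and is assumed here for illustration; the function name and sample data are hypothetical.

```python
from datetime import date


def apply_scd2(dimension, key, new_attrs, change_date):
    """Close the current row for key (if it changed) and append a new version."""
    current = next(
        (r for r in dimension if r["key"] == key and r["is_current"]), None
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return  # nothing changed, keep the current row as is
        current["is_current"] = False
        current["valid_to"] = change_date
    dimension.append({
        "key": key,
        "attrs": new_attrs,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })


dim = []
apply_scd2(dim, "C1", {"city": "Pune"}, date(2024, 1, 1))
apply_scd2(dim, "C1", {"city": "Delhi"}, date(2024, 6, 1))
apply_scd2(dim, "C1", {"city": "Delhi"}, date(2024, 7, 1))  # no change, no new row
```

A Type 1 change would instead overwrite `attrs` on the existing row, losing the earlier value.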
Reverse a string using SQL and Python code.
In SQL, use the REVERSE function (available in MySQL, SQL Server, and PostgreSQL, among others) to reverse a string.
In Python, use slicing with a step of -1 to reverse a string.
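A minimal Python sketch of the slicing approach is shown below; the function name is just for illustration. The SQL counterpart is given as a comment because REVERSE is not available in every database.

```python
def reverse_string(text):
    """Return text reversed, using a slice with a step of -1."""
    return text[::-1]


reversed_text = reverse_string("data")

# SQL counterpart where REVERSE is supported (e.g. MySQL, SQL Server, PostgreSQL):
#   SELECT REVERSE('data');
```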
Use Spark and SQL to find the top 5 countries with the highest population.
Use Spark to load the data and perform data processing.
Use SQL queries to group by country and sum the population.
Order the results in descending order and limit to top 5.
Example: SELECT country, SUM(population) AS total_population FROM table_name GROUP BY country ORDER BY total_population DESC LIMIT 5
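The query above can be tried out locally without a Spark cluster. The sketch below swaps in Python's built-in sqlite3 module in place of Spark SQL, purely to show the grouping, ordering, and limiting; the table name `city_population` and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_population (country TEXT, city TEXT, population INTEGER)"
)
conn.executemany(
    "INSERT INTO city_population VALUES (?, ?, ?)",
    [
        ("A", "a1", 500), ("A", "a2", 300),
        ("B", "b1", 900),
        ("C", "c1", 100), ("C", "c2", 50),
        ("D", "d1", 700),
        ("E", "e1", 400),
        ("F", "f1", 600),
    ],
)

# Group by country, sum the population, sort descending, keep the top 5.
top5 = conn.execute(
    """
    SELECT country, SUM(population) AS total_population
    FROM city_population
    GROUP BY country
    ORDER BY total_population DESC
    LIMIT 5
    """
).fetchall()
conn.close()
```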
Find the records returned by different joins across two tables.
Use SQL queries to perform the different joins: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN
Identify the key columns in both tables to join on
Select the columns from both tables; to isolate records that exist in only one table, use an outer join and a WHERE clause that checks the other table's key IS NULL
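To see how the result sets differ, here is a small sketch using Python's built-in sqlite3 module. The tables `table_a` and `table_b` and their rows are made up for the example. Only INNER JOIN and LEFT JOIN are shown, because RIGHT JOIN and FULL JOIN need a recent SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER, val TEXT);
CREATE TABLE table_b (id INTEGER, val TEXT);
INSERT INTO table_a VALUES (1, 'x'), (2, 'y'), (3, 'z');
INSERT INTO table_b VALUES (2, 'y'), (3, 'z'), (4, 'w');
""")

# INNER JOIN keeps only ids present in both tables.
inner_rows = conn.execute(
    "SELECT a.id FROM table_a a INNER JOIN table_b b ON a.id = b.id ORDER BY a.id"
).fetchall()

# LEFT JOIN keeps every row of table_a; unmatched rows get NULL for table_b.
left_rows = conn.execute(
    "SELECT a.id, b.id FROM table_a a LEFT JOIN table_b b ON a.id = b.id ORDER BY a.id"
).fetchall()

# Filtering on NULL isolates the rows that exist only in table_a.
only_in_a = conn.execute(
    "SELECT a.id FROM table_a a LEFT JOIN table_b b ON a.id = b.id "
    "WHERE b.id IS NULL"
).fetchall()
conn.close()
```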
The Catalyst optimizer is the query optimization framework used in Apache Spark SQL to improve performance by generating an optimized query plan.
Catalyst combines rule-based and cost-based optimization.
It applies rules to transform the logical query plan into an optimized logical plan, from which a physical plan is selected.
The optimizer applies various optimization techniques like predicate pushdown, constant folding, and join reordering.
By o...
Used query optimization techniques to improve performance in database queries.
Utilized indexing to speed up search queries.
Implemented query caching to reduce redundant database calls.
Optimized SQL queries by restructuring joins and subqueries.
Utilized database partitioning to improve query performance.
Used query profiling tools to identify and optimize slow queries.
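For a pandas data frame, use the len() function to check its length; a PySpark DataFrame does not support len().
len(df) returns the number of rows in a pandas data frame.
If the length is 0, then the data frame is empty.
Example: if len(df) == 0: print('Data frame is empty')
For a PySpark DataFrame, check len(df.head(1)) == 0 or df.rdd.isEmpty() instead, which avoids counting every row.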
Cores and worker nodes are decided based on the workload requirements and scalability needs of the data processing system.
Consider the size and complexity of the data being processed
Evaluate the processing speed and memory requirements of the tasks
Take into account the parallelism and concurrency needed for efficient data processing
Monitor the system performance and adjust cores and worker nodes as needed
Enforcing schema ensures that data conforms to a predefined structure and rules.
Ensures data integrity by validating incoming data against predefined schema
Helps in maintaining consistency and accuracy of data
Prevents data corruption and errors in data processing
Can lead to rejection of data that does not adhere to the schema
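The idea can be sketched in plain Python as a validation step that accepts or rejects each record. The schema, field names, and sample records are assumptions for illustration; engines such as Spark enforce schemas through their own mechanisms.

```python
# Hypothetical schema: field name -> expected Python type
SCHEMA = {"id": int, "name": str, "amount": float}


def enforce_schema(records, schema):
    """Split records into those that match the schema and those that do not."""
    accepted, rejected = [], []
    for record in records:
        ok = set(record) == set(schema) and all(
            isinstance(record[field], expected)
            for field, expected in schema.items()
        )
        (accepted if ok else rejected).append(record)
    return accepted, rejected


records = [
    {"id": 1, "name": "Asha", "amount": 10.5},
    {"id": "2", "name": "Ravi", "amount": 7.0},  # id has the wrong type
    {"id": 3, "name": "Meera"},                  # amount is missing
]
accepted, rejected = enforce_schema(records, SCHEMA)
```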
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
I am a detail-oriented data entry operator with strong organizational skills and a passion for accuracy.
Experienced in entering data accurately and efficiently
Proficient in using data entry software and tools
Strong attention to detail and ability to spot errors
Excellent organizational skills and ability to prioritize tasks
Ability to work independently and meet deadlines
Specialist (40 salaries): ₹2.2 L/yr - ₹4.1 L/yr
Verification Specialist (10 salaries): ₹3.5 L/yr - ₹4.4 L/yr
Senior Associate (7 salaries): ₹1.7 L/yr - ₹3 L/yr
Process Leader (7 salaries): ₹3.8 L/yr - ₹6.3 L/yr
Senior Verification Specialist (6 salaries): ₹3.6 L/yr - ₹4 L/yr
Accenture
Capgemini
HCLTech
Teleperformance