I applied via Naukri.com and was interviewed in Jun 2022. There were 2 interview rounds.
Data Management involves the process of collecting, storing, organizing, maintaining, and utilizing data effectively.
Data Management is the process of ensuring that data is accurate, complete, and consistent.
It involves the creation of policies and procedures for data usage, storage, and maintenance.
Data Governance is a subset of Data Management that focuses on the management of data assets and their usage.
Data management involves the process of collecting, storing, processing, and maintaining data, while data governance is the overall management of the availability, usability, integrity, and security of data.
Data management focuses on the technical aspects of handling data, while data governance focuses on the policies, procedures, and standards for managing data.
Data management involves tasks such as data entry, data cleaning, data storage, and data processing.
Big Data refers to large and complex data sets that cannot be processed using traditional data processing methods. The 4V's of data are Volume, Velocity, Variety, and Veracity.
Volume: Refers to the amount of data being generated and stored.
Velocity: Refers to the speed at which data is being generated and processed.
Variety: Refers to the different types of data being generated, including structured, semi-structured, and unstructured data.
Veracity: Refers to the quality and trustworthiness of the data.
I applied via Naukri.com and was interviewed in Sep 2024. There were 4 interview rounds.
Basic aptitude questions
Data structure and algorithms
I applied via Walk-in and was interviewed in Apr 2024. There were 3 interview rounds.
Lazy evaluation in Spark delays the execution of transformations until an action is called.
Lazy evaluation allows Spark to optimize the execution plan by combining multiple transformations into a single stage.
Transformations are not executed immediately, but are stored as a directed acyclic graph (DAG) of operations.
Actions trigger the execution of the DAG and produce results.
Example: map() and filter() are transformations; an action such as collect() or count() triggers the actual computation.
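To make the lazy-evaluation point concrete, here is a minimal PySpark sketch (run locally; the data and names are made up): the map() and filter() calls only record steps in the DAG, and nothing executes until collect() is called.

```python
# Minimal PySpark sketch of lazy evaluation (assumes a local Spark installation).
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("lazy-eval-demo").getOrCreate()
sc = spark.sparkContext

numbers = sc.parallelize(range(1, 11))

# Transformations: nothing is computed yet, Spark only records them in the DAG.
doubled = numbers.map(lambda x: x * 2)
evens = doubled.filter(lambda x: x % 4 == 0)

# Action: triggers execution of the whole DAG and returns results to the driver.
print(evens.collect())   # [4, 8, 12, 16, 20]

spark.stop()
```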
MapReduce is a programming model and processing technique for parallel and distributed computing.
MapReduce is used to process large datasets in parallel across a distributed cluster of computers.
It consists of two main functions - Map function for processing key/value pairs and Reduce function for aggregating the results.
It is widely used in big data processing frameworks like Hadoop for tasks such as data sorting, searching, and indexing.
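A toy word count, written in plain Python rather than on Hadoop itself, is a common way to illustrate the model: the map phase emits (word, 1) pairs and the reduce phase sums them per key.

```python
# Toy word-count written in the MapReduce style (pure Python, not Hadoop itself).
from itertools import groupby
from operator import itemgetter

def map_phase(document):
    # Map: emit (key, value) pairs -- here (word, 1) for every word.
    for word in document.split():
        yield (word.lower(), 1)

def reduce_phase(word, counts):
    # Reduce: aggregate all values that share the same key.
    return (word, sum(counts))

documents = ["big data is big", "data engineering uses big data"]

# Shuffle/sort: group intermediate pairs by key, as the framework would do between phases.
pairs = sorted(kv for doc in documents for kv in map_phase(doc))
result = [reduce_phase(word, (count for _, count in group))
          for word, group in groupby(pairs, key=itemgetter(0))]

print(result)  # [('big', 3), ('data', 3), ('engineering', 1), ('is', 1), ('uses', 1)]
```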
Skewness is a measure of asymmetry in a distribution. Skewed tables are tables with imbalanced data distribution.
Skewness is a statistical measure that describes the asymmetry of the data distribution around the mean.
Positive skewness indicates a longer tail on the right side of the distribution, while negative skewness indicates a longer tail on the left side.
Skewed tables in data engineering refer to tables with an imbalanced data distribution, for example where a few key values account for most of the rows, which can leave some partitions or tasks doing far more work than others.
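As a rough illustration (the data here is hypothetical), both the statistical skewness of a numeric column and the key imbalance that makes a table "skewed" can be checked quickly with pandas:

```python
# Quick checks for column skewness and key imbalance (pandas sketch, made-up data).
import pandas as pd

# Hypothetical transactions table: one customer dominates, so 'customer_id' is a skewed key.
df = pd.DataFrame({
    "customer_id": ["c1"] * 95 + ["c2", "c3", "c4", "c5", "c6"],
    "amount": list(range(95)) + [500, 600, 700, 800, 900],
})

# Statistical skewness of 'amount': a positive value indicates a long right tail.
print(df["amount"].skew())

# Key distribution: a few keys holding most rows is what causes straggler tasks
# in distributed joins and aggregations.
print(df["customer_id"].value_counts(normalize=True).head())
```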
Spark is a distributed computing framework designed for big data processing.
Spark is built around the concept of Resilient Distributed Datasets (RDDs) which allow for fault-tolerant parallel processing of data.
It provides high-level APIs in Java, Scala, Python, and R for ease of use.
Spark can run on top of Hadoop, Mesos, Kubernetes, or in standalone mode.
It includes modules for SQL, streaming, machine learning, and graph processing.
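A minimal sketch of the high-level API, assuming a local PySpark installation and made-up data, showing a simple aggregation through the DataFrame and SQL interfaces:

```python
# Minimal sketch of Spark's high-level DataFrame/SQL API in Python (hypothetical data).
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("spark-sql-demo").getOrCreate()

# Create a small DataFrame; in practice this would come from Parquet, Hive, Kafka, etc.
df = spark.createDataFrame(
    [("alice", "bangalore", 120), ("bob", "mumbai", 80), ("carol", "bangalore", 200)],
    ["name", "city", "txn_count"],
)

# The same data exposed to the SQL module via a temporary view.
df.createOrReplaceTempView("customers")
spark.sql("SELECT city, SUM(txn_count) AS total FROM customers GROUP BY city").show()

spark.stop()
```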
I applied via Referral and was interviewed in Oct 2024. There was 1 interview round.
Basic aptitude test with questions on distance, ages, etc.
I applied via Naukri.com and was interviewed in Mar 2024. There were 3 interview rounds.
Error handling in PySpark involves using try-except blocks and logging to handle exceptions and errors.
Use try-except blocks to catch and handle exceptions in PySpark code
Utilize logging to record errors and exceptions for debugging purposes
Consider setting the reader option .option('mode', 'PERMISSIVE') to handle corrupt records when reading data
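A hedged sketch that combines the points above; the input path and column names are hypothetical, and the details depend on the data being read.

```python
# Sketch of the patterns above: try/except with logging, plus PERMISSIVE mode on the reader.
# The input path and column names are hypothetical, for illustration only.
import logging
from pyspark.sql import SparkSession
from pyspark.sql.utils import AnalysisException

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl_job")

spark = SparkSession.builder.appName("error-handling-demo").getOrCreate()

try:
    # PERMISSIVE mode keeps malformed records instead of failing the whole read;
    # they are routed into the corrupt-record column for later inspection.
    df = (spark.read
          .option("mode", "PERMISSIVE")
          .option("columnNameOfCorruptRecord", "_corrupt_record")
          .json("/data/raw/events/"))       # hypothetical input path
    logger.info("Read %d rows", df.count())
except AnalysisException as exc:
    # e.g. missing path or schema problems: log the error, then re-raise or handle it.
    logger.error("Failed to read input: %s", exc)
    raise
finally:
    spark.stop()
```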
I was asked Python, SQL, and coding questions.
Case study on how you would estimate the total footfall at an airport.
I applied via Naukri.com and was interviewed in Apr 2024. There were 2 interview rounds.
Detailed interview on SQL, Tableau & Alteryx
Three sections: aptitude, ML, and a case study.
Factors such as foot traffic, proximity to banks, crime rates, and demographics should be considered for ATM placements in a city.
Foot traffic in the area
Proximity to banks or financial institutions
Crime rates in the neighborhood
Demographics of the area (income levels, age groups)
Accessibility and visibility of the location
Local regulations and zoning laws
Availability of power and network connections
Competition from other nearby ATMs and banks
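If the interviewer pushes for something quantitative, one possible approach is a weighted scoring model over candidate sites; the factors, weights, and numbers below are purely illustrative assumptions, not real data.

```python
# One simple way to turn the factors above into a ranking: a weighted score per candidate site.
# All column names, weights, and values here are hypothetical, for illustration only.
import pandas as pd

sites = pd.DataFrame({
    "site": ["Mall Road", "Tech Park", "Old Market"],
    "foot_traffic": [0.9, 0.7, 0.5],      # each factor normalised to a 0-1 score
    "bank_proximity": [0.6, 0.8, 0.4],
    "safety": [0.7, 0.9, 0.5],            # inverse of crime rate
    "demographics": [0.8, 0.9, 0.6],
    "visibility": [0.9, 0.6, 0.7],
})

weights = {"foot_traffic": 0.35, "bank_proximity": 0.15, "safety": 0.15,
           "demographics": 0.20, "visibility": 0.15}

# Weighted sum per site; higher score = stronger candidate for ATM placement.
sites["score"] = sum(sites[col] * w for col, w in weights.items())
print(sites.sort_values("score", ascending=False)[["site", "score"]])
```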
Designation | No. of salaries reported | Salary range
Assistant Vice President | 4.6k | ₹17 L/yr - ₹47.5 L/yr
Assistant Manager | 3.3k | ₹6 L/yr - ₹20 L/yr
Officer | 2.8k | ₹10.1 L/yr - ₹35 L/yr
Vice President | 2.4k | ₹24 L/yr - ₹70 L/yr
Manager | 2.3k | ₹9.4 L/yr - ₹37 L/yr
State Bank of India
HDFC Bank
ICICI Bank
Axis Bank