RDD stands for Resilient Distributed Dataset and is the fundamental data structure of Apache Spark.
RDD is a distributed collection of objects that can be operated on in parallel.
DataFrames and Datasets are higher-level abstractions built on top of RDDs.
RDDs are more low-level and offer more control over data processing compared to DataFrames and Datasets.
Partitioning is the process of dividing data into smaller chunks for better organization and processing in distributed systems.
Partitioning helps in distributing data across multiple nodes for parallel processing.
Coalesce reduces the number of partitions by merging existing ones and avoids a full shuffle, while repartition can increase or decrease the number of partitions and always performs a full shuffle.
Example: coalesce(5) will merge partitions into 5 pa...
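The difference can be illustrated with a toy model in plain Python. This is only a sketch and does not use the Spark API: the helper names `coalesce_partitions` and `repartition_round_robin` are hypothetical, and round-robin redistribution stands in for a real shuffle.

```python
from typing import List

Partition = List[int]


def coalesce_partitions(partitions: List[Partition], n: int) -> List[Partition]:
    """Merge whole partitions into at most n groups; no partition is split."""
    if n >= len(partitions):
        # Like coalesce, this toy model never increases the partition count.
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Neighbouring input partitions land in the same output partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition_round_robin(partitions: List[Partition], n: int) -> List[Partition]:
    """Redistribute every record individually (toy stand-in for a full shuffle)."""
    out: List[Partition] = [[] for _ in range(n)]
    records = [r for p in partitions for r in p]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


parts = [[1, 2], [3], [4, 5, 6], [7]]
coalesced = coalesce_partitions(parts, 2)          # whole partitions merged
repartitioned = repartition_round_robin(parts, 6)  # every record moved
```

In the coalesced result each input partition stays intact inside one output partition, which is why no full shuffle is needed. The repartitioned result spreads records evenly but touches every record.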
Spark is a distributed computing framework that provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Spark has a master-slave architecture with a driver program that communicates with a cluster manager to distribute work across worker nodes.
It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.
Spark supports various programming l...
DAG stands for Directed Acyclic Graph. It is a finite directed graph with no cycles.
DAG is a collection of nodes connected by edges where each edge goes from one node to another, but no cycles are allowed.
In the context of Spark, a DAG represents the sequence of transformations that need to be applied to the input data to get the final output.
When a Spark job is submitted, Spark creates a DAG of the transformations spe...
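A DAG of transformations can be sketched with Python's standard `graphlib` module (Python 3.9+). This is not how Spark builds its DAG internally; the step names `read`, `filter`, `map`, and `reduce` are assumptions chosen for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical pipeline).
lineage = {
    "read": set(),
    "filter": {"read"},
    "map": {"filter"},
    "reduce": {"map"},
}

# A topological order is a valid execution order for the DAG.
order = list(TopologicalSorter(lineage).static_order())


def is_acyclic(graph: dict) -> bool:
    """Return True if the dependency graph contains no cycle."""
    try:
        list(TopologicalSorter(graph).static_order())  # consume the iterator
        return True
    except CycleError:
        return False
```

Because no cycles are allowed, a valid execution order always exists. A graph such as `{"a": {"b"}, "b": {"a"}}` has no such order and is rejected.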
posted on 3 Nov 2022
I applied via Campus Placement and was interviewed before Nov 2021. There were 3 interview rounds.
There were 30 basic aptitude questions to be answered in 30 minutes.
There were 3 coding questions: 2 were easy and 1 was of medium level.
I applied via Naukri.com
I applied via Recruitment Consultant and was interviewed in Feb 2024. There was 1 interview round.
Basic data structures and algorithms round with questions on arrays and strings.
I applied via LinkedIn and was interviewed in Jul 2024. There were 2 interview rounds.
Find the minimum element in a rotated sorted array.
Use binary search to locate the pivot, the point where the rotation breaks the sorted order.
Compare the middle element with the last element of the current range: if it is larger, the minimum lies in the right half; otherwise it lies in the left half, including the middle element.
Repeat on the chosen half until a single element remains; that element is the minimum.
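One common way to write this binary search is sketched below. It compares the middle element with the last element of the current range and assumes the array contains distinct values; the function name `find_min_rotated` is an illustrative choice.

```python
from typing import List


def find_min_rotated(nums: List[int]) -> int:
    """Return the minimum of a rotated sorted array of distinct values."""
    if not nums:
        raise ValueError("nums must not be empty")
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] > nums[hi]:
            # The rotation point is to the right of mid.
            lo = mid + 1
        else:
            # mid could itself be the minimum, so keep it in range.
            hi = mid
    return nums[lo]
```

The range halves on every iteration, so the search takes O(log n) time and O(1) extra space.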
API for Instagram application to interact with user data and DB schema to store user information and posts.
API endpoints for user authentication, posting photos, liking photos, following users, etc.
DB schema with tables for users, posts, comments, likes, followers, etc.
Example API endpoint: /users/{userId}/posts to retrieve all posts by a specific user.
Example DB schema: Users table with columns for username, email, pr
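A minimal sketch of such a schema and of the lookup behind `/users/{userId}/posts`, using Python's built-in `sqlite3`. Only the `username` and `email` columns come from the answer above; the `posts` columns, the sample rows, and the helper names `create_schema` and `get_posts_by_user` are assumptions for illustration.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a reduced users/posts schema (columns partly assumed)."""
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            username TEXT NOT NULL UNIQUE,
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            caption TEXT
        );
        """
    )


def get_posts_by_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Query a handler for /users/{userId}/posts might run."""
    rows = conn.execute(
        "SELECT id, caption FROM posts WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return [{"id": row[0], "caption": row[1]} for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO users (id, username, email) VALUES (1, 'alice', 'alice@example.com')")
conn.execute("INSERT INTO users (id, username, email) VALUES (2, 'bob', 'bob@example.com')")
conn.execute("INSERT INTO posts (user_id, caption) VALUES (1, 'first'), (1, 'second'), (2, 'other')")
posts = get_posts_by_user(conn, 1)
```

Likes, comments, and followers would follow the same pattern, as separate tables that reference `users(id)` and `posts(id)`.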
I applied via Approached by Company and was interviewed in Mar 2024. There were 2 interview rounds.
I appeared for an interview before Mar 2024, where I was asked the following questions.
A dirty read occurs when a transaction reads data that has been modified but not yet committed by another transaction.
Dirty reads can lead to inconsistencies, because the uncommitted data may still be changed or rolled back before the writing transaction completes.
For example, if Transaction A updates a record but hasn't committed yet, and Transaction B reads that record, Transaction B sees uncommitted data.
If Transaction A rolls back, Transaction B will have...
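The scenario can be mimicked with a toy in-memory store. This is a simplified model, not a real database engine: the class `ToyStore` and the `allow_dirty` flag are hypothetical and stand in for a read-uncommitted isolation level.

```python
class ToyStore:
    """Toy key-value store with one set of uncommitted (pending) writes."""

    def __init__(self, data: dict) -> None:
        self.committed = dict(data)
        self.pending = {}  # uncommitted writes: key -> value

    def write(self, key: str, value: int) -> None:
        self.pending[key] = value

    def commit(self) -> None:
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        self.pending.clear()

    def read(self, key: str, allow_dirty: bool) -> int:
        # With allow_dirty, a reader sees writes that were never committed.
        if allow_dirty and key in self.pending:
            return self.pending[key]
        return self.committed[key]


store = ToyStore({"balance": 100})
store.write("balance", 50)                        # Transaction A updates, no commit yet
dirty = store.read("balance", allow_dirty=True)   # Transaction B sees uncommitted data
clean = store.read("balance", allow_dirty=False)  # a committed-only read
store.rollback()                                  # Transaction A rolls back
after = store.read("balance", allow_dirty=True)
```

Here `dirty` holds a value that never became part of the committed state, which is the inconsistency a dirty read introduces.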
Design patterns are reusable solutions to common software design problems, promoting best practices and improving code maintainability.
Creational patterns (e.g., Singleton, Factory Method) manage object creation.
Structural patterns (e.g., Adapter, Composite) deal with object composition.
Behavioral patterns (e.g., Observer, Strategy) focus on communication between objects.
Design patterns help in code reusability and sca
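Two of the behavioral patterns named above can be sketched in a few lines of Python. The names `EventBus` and `sort_with` are illustrative assumptions, and this is only one of many ways to express these patterns.

```python
from typing import Callable, List


class EventBus:
    """Observer: subscribers are notified whenever an event is published."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, event: str) -> None:
        for callback in self._subscribers:
            callback(event)


def sort_with(strategy: Callable[[str], object], items: List[str]) -> List[str]:
    """Strategy: the ordering rule is passed in and can be swapped freely."""
    return sorted(items, key=strategy)


received: List[str] = []
bus = EventBus()
bus.subscribe(received.append)   # the list's append method acts as the observer
bus.publish("order_created")

by_length = sort_with(len, ["ccc", "a", "bb"])
by_alpha = sort_with(str.lower, ["b", "A", "c"])
```

The publisher does not know who is listening, and `sort_with` does not know which ordering rule it applies. That decoupling is what makes both patterns reusable.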
I applied via LinkedIn and was interviewed in Oct 2021. There were 4 interview rounds.
| Role | Salaries reported | Salary range |
| --- | --- | --- |
| Senior Applied Data Scientist | 128 | ₹10.9 L/yr - ₹18 L/yr |
| Applied Data Scientist | 84 | ₹9.5 L/yr - ₹15.5 L/yr |
| Lead Applied Data Scientist | 82 | ₹17 L/yr - ₹27.8 L/yr |
| Lead Engineer | 51 | ₹16 L/yr - ₹54.9 L/yr |
| Senior Engineer | 49 | ₹10 L/yr - ₹34 L/yr |