Easy to moderate level questions.
I applied via LinkedIn and was interviewed in Mar 2022. There were 2 interview rounds.
I applied via Referral and was interviewed in Jul 2021. There was 1 interview round.
I applied via Recruitment Consultant and was interviewed in May 2021. There were 3 interview rounds.
I applied via LinkedIn and was interviewed in Nov 2022. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in May 2020. There were 3 interview rounds.
I applied via LinkedIn and was interviewed in Mar 2023. There were 4 interview rounds.
Optimisation techniques in PySpark and sample code for SCD Type 2
Use broadcast joins (or broadcast variables for small lookup data) to reduce data shuffling
Partition data based on key columns to improve performance
Use cache() or persist() to avoid recomputing data
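Use coalesce() to reduce the number of partitions without a full shuffle; use repartition() when a full shuffle to rebalance or increase partitions is acceptable
For SCD Type 2, close the existing record and insert the new version; merge() is not a core PySpark DataFrame method, so this usually means Delta Lake's DeltaTable.merge() or equivalent joins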
Example code for SCD type 2: https://github.com/awantik/pyspark-101
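The SCD Type 2 logic mentioned above can be sketched in plain Python, so it runs without a Spark cluster. This is a minimal illustration, not the code from the linked repository: the DimRow fields (customer_id, city, valid_from, valid_to, is_current) and the apply_scd2 helper are hypothetical names chosen for the example. In PySpark the same steps would typically be expressed with joins or a Delta Lake MERGE.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DimRow:
    customer_id: int           # business key (hypothetical)
    city: str                  # tracked attribute (hypothetical)
    valid_from: date
    valid_to: Optional[date]   # None means the version is still open
    is_current: bool


def apply_scd2(dim: List[DimRow], updates: Dict[int, str], load_date: date) -> List[DimRow]:
    """Return a new dimension table with SCD Type 2 changes applied."""
    result: List[DimRow] = []
    keys_with_current_row = set()
    for row in dim:
        if row.is_current:
            keys_with_current_row.add(row.customer_id)
        new_city = updates.get(row.customer_id)
        if row.is_current and new_city is not None and new_city != row.city:
            # Changed attribute: close the current version ...
            result.append(replace(row, valid_to=load_date, is_current=False))
            # ... and open a new current version starting at load_date.
            result.append(DimRow(row.customer_id, new_city, load_date, None, True))
        else:
            # Historical rows and unchanged current rows pass through as-is.
            result.append(row)
    # Keys without a current row are new: insert them as current versions.
    for key, city in updates.items():
        if key not in keys_with_current_row:
            result.append(DimRow(key, city, load_date, None, True))
    return result


dim = [
    DimRow(1, "Pune", date(2020, 1, 1), None, True),
    DimRow(2, "Delhi", date(2020, 1, 1), None, True),
]
# Customer 1 moved, customer 3 is new, customer 2 has no update.
updated = apply_scd2(dim, {1: "Mumbai", 3: "Chennai"}, date(2023, 3, 1))
for row in updated:
    print(row)
```

History is preserved because the old row is closed rather than overwritten, so a query filtered on is_current sees only the latest version of each key.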
I have worked with huge data sets in the past.
I have experience working with data sets ranging from a few gigabytes to several terabytes.
One of the toughest problems I faced was optimizing a query that was taking too long to execute.
I resolved it by analyzing the query plan and identifying the bottleneck, then optimizing the query accordingly.
Another challenge I faced was dealing with data inconsistencies, which I reso...
Our project architecture follows a microservices approach with containerization using Docker and orchestration with Kubernetes.
We have divided our application into smaller, independent services that communicate with each other through APIs.
Each service is containerized using Docker, allowing for easy deployment and scaling.
We use Kubernetes for orchestration, which automates the deployment, scaling, and management of o...
I applied via Naukri.com and was interviewed before Aug 2022. There were 4 interview rounds.
I applied via Approached by Company and was interviewed in Jun 2022. There were 2 interview rounds.
Streams, introduced in Java 8 and available in Java 11, provide a concise way to process collections.
Streams allow for functional-style operations on collections.
They can be used to filter, map, and reduce data.
Parallel streams can improve performance on large collections when the work is CPU-bound and free of shared mutable state.
Example: List
| Role | Salaries reported | Salary range |
| Data Engineer | 379 | ₹1.8 L/yr - ₹4.1 L/yr |
| Programmer Analyst | 20 | ₹2.3 L/yr - ₹6.2 L/yr |
| Software Developer | 7 | ₹1.8 L/yr - ₹2.5 L/yr |
| Team Lead | 7 | ₹3.3 L/yr - ₹4 L/yr |
| Data Annotation Engineer | 6 | ₹1.8 L/yr - ₹1.8 L/yr |