I applied after being approached by the company.
Setting up a data engineering solution on GCP involves combining various GCP services for data processing and storage.
Utilize Google Cloud Storage for storing raw data
Use Google BigQuery for data processing and analysis
Implement data pipelines using Google Cloud Dataflow or Apache Beam
Leverage Google Cloud Pub/Sub for real-time data streaming
Utilize Google Cloud Composer for orchestrating data workflows
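The steps above can be sketched as a minimal command-line walkthrough. The bucket, dataset, table, and topic names are illustrative, not from the original answer:

```shell
# Stage raw data in Cloud Storage (bucket name is hypothetical)
gsutil cp sales.csv gs://my-raw-data-bucket/landing/

# Load it into BigQuery for processing and analysis
# (dataset and table names are hypothetical)
bq load --autodetect --source_format=CSV \
    analytics.sales gs://my-raw-data-bucket/landing/sales.csv

# Create a Pub/Sub topic for real-time event streaming
gcloud pubsub topics create sales-events
```

Dataflow pipelines and Composer DAGs would then consume from these sources; the exact pipeline code depends on the use case.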
Spark architecture refers to the structure of Apache Spark, a distributed computing framework.
Spark architecture consists of a cluster manager, worker nodes, and a driver program.
The cluster manager allocates resources and schedules tasks across worker nodes.
Worker nodes execute tasks in parallel and store data in memory or disk.
The driver program coordinates the execution of tasks and manages the overall workflow.
Spar...
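The three components named above all show up as options of a typical `spark-submit` invocation; the sketch below assumes YARN as the cluster manager and a hypothetical `app.py` driver program:

```shell
# The driver runs app.py; the cluster manager (YARN here) allocates
# executors on worker nodes according to the requested resources.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 4g \
  --num-executors 10 \
  --executor-memory 8g \
  --executor-cores 4 \
  app.py
```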
Optimizing Spark jobs involves tuning configurations, partitioning data, using appropriate data structures, and leveraging caching.
Tune Spark configurations for optimal performance
Partition data to distribute workload evenly
Use appropriate data structures like DataFrames or Datasets
Leverage caching to avoid recomputation
Optimize shuffle operations to reduce data movement
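A few of these tuning knobs can be expressed as a `spark-defaults.conf` fragment. The values are illustrative starting points, not universal recommendations:

```
# spark-defaults.conf — example values only; tune per workload
spark.sql.shuffle.partitions      200
spark.sql.adaptive.enabled        true
spark.serializer                  org.apache.spark.serializer.KryoSerializer
spark.executor.memory             8g
spark.memory.fraction             0.6
```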
It was also very easy.
Dynamic data frame in AWS Glue job is a dynamically generated data frame, while Spark data frame is specifically created using Spark APIs.
Dynamic data frame is generated dynamically at runtime based on the data source and schema, while Spark data frame is explicitly created using Spark APIs.
Dynamic data frame is more flexible but may have performance implications compared to Spark data frame.
You can convert a dynamic data frame to a Spark data frame using the toDF() method, and convert back with fromDF().
I applied via Naukri.com and was interviewed in Aug 2020. There were 4 interview rounds.
I applied via Naukri.com and was interviewed in Dec 2024. There were 2 interview rounds.
Django applies migrations to the database using the 'manage.py migrate' command.
Django tracks changes to models and generates migration files accordingly.
The 'manage.py makemigrations' command creates migration files based on model changes.
The 'manage.py migrate' command applies the generated migration files to the database.
Migrations help keep the database schema in sync with the changes in Django models.
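The workflow described above boils down to a short command sequence (the app name and migration number are hypothetical):

```shell
# Generate migration files from model changes
python manage.py makemigrations

# Preview the SQL a given migration would run
python manage.py sqlmigrate myapp 0002

# Apply pending migrations to the database
python manage.py migrate

# List which migrations are applied / unapplied
python manage.py showmigrations
```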
Hoisting in JavaScript is the behavior where variable and function declarations are moved to the top of their containing scope during the compilation phase.
Variable declarations are hoisted to the top of their scope, but not their assignments.
Function declarations are fully hoisted, meaning they can be called before they are declared.
Hoisting can lead to unexpected behavior if not understood properly.
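Both behaviors can be seen in a few lines. The function and variable names here are made up for illustration:

```javascript
// Function declarations are hoisted in full, so this call works
// even though square() is defined further down the file.
console.log(square(4)); // 16

// var declarations are hoisted, but their assignments are not:
// at this point `n` already exists in scope, with the value undefined.
console.log(typeof n); // "undefined"
var n = 10;

function square(x) {
  return x * x;
}
```

Note that `let` and `const` behave differently: accessing them before their declaration throws a ReferenceError (the "temporal dead zone").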
Create a full stack application in 3 days.
I applied via Naukri.com and was interviewed in Dec 2024. There were 4 interview rounds.
A set of questions on English and aptitude, all at an easy level
SQL basics and some query questions
I applied via Company Website and was interviewed in Aug 2024. There were 2 interview rounds.
Uber data model design for efficient storage and retrieval of ride-related information.
Create tables for users, drivers, rides, payments, and ratings
Include attributes like user_id, driver_id, ride_id, payment_id, rating_id, timestamp, location, fare, etc.
Establish relationships between tables using foreign keys
Implement indexing for faster query performance
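The table layout above can be sketched with an in-memory SQLite database; all column names and sample values are illustrative:

```python
import sqlite3

# In-memory sketch of the ride-sharing schema described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE drivers  (driver_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE rides (
    ride_id   INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL REFERENCES users(user_id),
    driver_id INTEGER NOT NULL REFERENCES drivers(driver_id),
    ts        TEXT,
    pickup    TEXT,
    fare      REAL
);
CREATE TABLE payments (payment_id INTEGER PRIMARY KEY,
                       ride_id INTEGER REFERENCES rides(ride_id),
                       amount REAL);
CREATE TABLE ratings  (rating_id INTEGER PRIMARY KEY,
                       ride_id INTEGER REFERENCES rides(ride_id),
                       stars INTEGER);

-- Index the foreign key that ride-history queries filter on
CREATE INDEX idx_rides_user ON rides(user_id);
""")

conn.execute("INSERT INTO users VALUES (1, 'Asha')")
conn.execute("INSERT INTO drivers VALUES (7, 'Ravi')")
conn.execute("INSERT INTO rides VALUES (100, 1, 7, '2024-08-01T10:00', 'MG Road', 240.0)")

row = conn.execute(
    "SELECT u.name, d.name, r.fare FROM rides r "
    "JOIN users u ON u.user_id = r.user_id "
    "JOIN drivers d ON d.driver_id = r.driver_id"
).fetchone()
print(row)  # ('Asha', 'Ravi', 240.0)
```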
I applied via Newspaper Ad and was interviewed in Aug 2024. There were 3 interview rounds.
There were three sections: 1) Aptitude Test, 2) SQL, 3) DSA
DSA stands for Data Structures and Algorithms. Sorting is the process of arranging data in a particular order. An array is a data structure that stores elements of the same data type in contiguous memory locations, while a linked list stores elements in nodes with pointers to the next node.
DSA stands for Data Structures and Algorithms
Sorting is the process of arranging data in a particular order
An array stores elements of the same data type in contiguous memory locations, while a linked list stores elements in nodes with pointers to the next node
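The three ideas can be shown together in a short sketch. The `Node` class and helper functions are made up for illustration:

```python
# Contiguous array (Python list) vs. a singly linked list, plus sorting.

class Node:
    """One element of a singly linked list: a value plus a pointer."""
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def to_linked_list(values):
    """Build a singly linked list and return its head node."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def linked_list_values(head):
    """Walk node-to-node via pointers (sequential access only)."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

arr = [5, 2, 9, 1]
arr.sort()                       # sorting: arrange in ascending order
print(arr)                       # [1, 2, 5, 9]
print(arr[2])                    # arrays allow O(1) random access -> 5

head = to_linked_list(arr)
print(linked_list_values(head))  # [1, 2, 5, 9]
```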
I have experience working on various data analysis projects, including market research, customer segmentation, and predictive modeling.
Developed predictive models to forecast customer behavior and optimize marketing strategies
Conducted market research to identify trends and opportunities for growth
Performed customer segmentation analysis to target specific demographics with personalized marketing campaigns
I applied via Naukri.com and was interviewed in May 2024. There was 1 interview round.
PySpark architecture is a distributed computing framework that combines Python and Spark to process big data.
PySpark architecture consists of a driver program, cluster manager, and worker nodes.
The driver program is responsible for creating SparkContext, which connects to the cluster manager.
Cluster manager allocates resources and schedules tasks on worker nodes.
Worker nodes execute the tasks and return results to the driver program.

Skewed partitioning is when data is not evenly distributed across partitions, leading to performance issues.
Skewed partitioning can occur when a key column has a few values that are much more common than others.
It can lead to uneven processing and resource utilization in distributed systems like Hadoop or Spark.
To address skewed partitioning, techniques like data skew detection, data skew handling, and data skew prevention can be applied, for example by salting hot keys or repartitioning.
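Key salting, one common mitigation, can be demonstrated with a toy partitioner in plain Python (the hash function and record counts are made up; real engines like Spark use their own partitioners):

```python
import random
from collections import Counter

random.seed(0)

# 10,000 records where one "hot" key dominates — classic skew.
events = [("hot", 1)] * 9000 + [("cold", 1)] * 1000

def partition_of(key, n_partitions=8):
    # Toy deterministic hash: sum of byte values (stand-in for a
    # real partitioner's hash function).
    return sum(key.encode()) % n_partitions

# Without salting, every "hot" record lands on one partition.
plain = Counter(partition_of(k) for k, _ in events)

def salted(key, n_salts=8):
    # Salting: append a random suffix so the hot key is split
    # into n_salts distinct keys that spread across partitions.
    return f"{key}#{random.randrange(n_salts)}"

salted_counts = Counter(partition_of(salted(k)) for k, _ in events)

print(max(plain.values()))          # one partition holds >= 9000 records
print(max(salted_counts.values()))  # load is now far more even
```

The trade-off: downstream aggregations must first aggregate per salted key, then strip the salt and combine, which adds a second shuffle stage.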