I applied via Campus Placement and was interviewed before Oct 2022. There were 3 interview rounds.
Basic coding questions were asked
posted on 14 Dec 2024
I was interviewed in Nov 2024.
Use the 'hdfs diskbalancer' command to check disk utilisation and health in Hadoop
Run 'hdfs diskbalancer -report' to get a report on disk utilisation
Use 'hdfs diskbalancer -plan <datanode>' to generate a plan for balancing disk usage on a specific DataNode
Check the Hadoop logs for any disk health issues
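The commands above can also be scripted. A minimal sketch, assuming the 'hdfs' CLI is on the PATH and the disk balancer is enabled on the cluster (dfs.disk.balancer.enabled=true); the DataNode hostname is a placeholder:

```python
import subprocess

# Sketch only: drive the HDFS disk balancer from Python.
# Assumes the 'hdfs' CLI is on the PATH and the disk balancer is
# enabled on the cluster (dfs.disk.balancer.enabled=true).

# Report disk utilisation for the most skewed DataNodes.
report = subprocess.run(
    ["hdfs", "diskbalancer", "-report", "-top", "5"],
    capture_output=True, text=True, check=True,
)
print(report.stdout)

# Generate a balancing plan for one DataNode (hostname is a placeholder).
subprocess.run(["hdfs", "diskbalancer", "-plan", "datanode1.example.com"], check=True)
```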
Spark Architecture consists of Driver, Cluster Manager, and Executors. Driver manages the execution of Spark jobs.
Driver: Manages the execution of Spark jobs, converts user code into tasks, and coordinates with Cluster Manager.
Cluster Manager: Manages resources across the cluster and allocates resources to Spark applications.
Executors: Execute tasks assigned by the Driver and store data in memory or on disk for further processing.
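A minimal runnable sketch of these roles, assuming a local PySpark installation; in local mode, worker threads stand in for executors:

```python
from pyspark.sql import SparkSession

# The driver runs this script: it builds the SparkSession, turns the
# job below into tasks, and hands them to executors via the cluster
# manager. In local mode, worker threads stand in for executors.
spark = (
    SparkSession.builder
    .appName("architecture-demo")
    .master("local[2]")   # "cluster manager" = local scheduler, 2 threads
    .getOrCreate()
)

# The driver converts this into tasks; executors run them in parallel
# and the results come back to the driver via collect().
df = spark.range(1_000_000)
print(df.selectExpr("sum(id) AS total").collect())

spark.stop()
```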
Optimization techniques in Spark improve performance and efficiency of data processing.
Partitioning data to distribute workload evenly
Caching frequently accessed data in memory
Using broadcast variables for small lookup tables
Avoiding shuffling operations whenever possible
Tuning memory settings and garbage collection parameters
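An illustrative PySpark sketch combining several of these techniques; the table and column names are made up:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.master("local[*]").appName("opt-demo").getOrCreate()

events = spark.range(1_000_000).withColumnRenamed("id", "user_id")
lookup = spark.createDataFrame(
    [(i, f"seg_{i % 3}") for i in range(100)], ["user_id", "segment"]
)

# Partition on the join key to spread work evenly across cores/executors.
events = events.repartition(8, "user_id")

# Cache a DataFrame that several later actions will reuse.
events.cache()

# Broadcast the small lookup table so the join needs no shuffle.
joined = events.join(broadcast(lookup), "user_id")
print(joined.count())

spark.stop()
```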
I am unable to provide this information as it is confidential.
Confidential information about salaries in previous organizations should not be disclosed.
It is important to respect the privacy and confidentiality of past employers.
Discussing specific salary details may not be appropriate in a professional setting.
To create a pivot table in SQL from a non-pivot table, you can use the CASE statement with aggregate functions.
Use the CASE statement to categorize data into columns
Apply aggregate functions like SUM, COUNT, AVG, etc. to calculate values for each category
Group the data by the columns you want to pivot on
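A small sketch of the CASE-based pivot, shown here with Python's built-in sqlite3 module; the 'sales' table and its columns are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 'Q1', 100), ('North', 'Q2', 150),
        ('South', 'Q1', 80),  ('South', 'Q2', 120);
""")

# CASE inside an aggregate turns each quarter value into its own column,
# and GROUP BY produces one pivoted row per region.
pivot = con.execute("""
    SELECT region,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
    FROM sales
    GROUP BY region
""").fetchall()
print(pivot)  # [('North', 100, 150), ('South', 80, 120)]
```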
Creating triggers in a database involves defining the trigger, specifying the event that will activate it, and writing the code to be executed.
Define the trigger using the CREATE TRIGGER statement
Specify the event that will activate the trigger (e.g. INSERT, UPDATE, DELETE)
Write the code or actions to be executed when the trigger is activated
Test the trigger to ensure it functions as intended
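A sketch of these steps using SQLite via Python's sqlite3 module; the table and trigger names are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE audit_log (order_id INTEGER, old_status TEXT, new_status TEXT);

    -- Steps 1 and 2: define the trigger and the event that activates it.
    CREATE TRIGGER trg_order_audit
    AFTER UPDATE OF status ON orders
    BEGIN
        -- Step 3: the action to run when the trigger fires.
        INSERT INTO audit_log VALUES (OLD.id, OLD.status, NEW.status);
    END;
""")

# Step 4: test the trigger -- the UPDATE should write one audit row.
con.execute("INSERT INTO orders VALUES (1, 'new')")
con.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
print(con.execute("SELECT * FROM audit_log").fetchall())  # [(1, 'new', 'shipped')]
```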
I applied via Referral and was interviewed in Dec 2024. There were 2 interview rounds.
30 Questions in 20 Minutes
posted on 11 Jun 2024
I applied via Job Portal and was interviewed in May 2024. There was 1 interview round.
PySpark architecture is based on the Apache Spark architecture, with additional components for Python integration.
PySpark architecture includes Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX.
It allows Python developers to interact with Spark using PySpark API.
PySpark architecture enables distributed processing of large datasets using RDDs and DataFrames.
It leverages the power of in-memory processing for fast computation.
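A short sketch showing the same computation through both the RDD and DataFrame APIs, assuming a local PySpark installation:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("pyspark-demo").getOrCreate()

# Low-level RDD API: a distributed collection of Python objects.
rdd = spark.sparkContext.parallelize(range(10))
print(rdd.map(lambda x: x * x).sum())  # 285

# High-level DataFrame API: goes through the Spark SQL optimizer.
df = spark.createDataFrame([(x,) for x in range(10)], ["n"])
df.selectExpr("sum(n * n) AS total").show()

spark.stop()
```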
I applied via Naukri.com and was interviewed in Jul 2023. There were 2 interview rounds.
Spark internal working and optimization techniques
Spark uses Directed Acyclic Graph (DAG) for optimizing workflows
Lazy evaluation helps in optimizing transformations by combining them into a single stage
Caching and persistence of intermediate results can improve performance
Partitioning data can help in parallel processing and reducing shuffle operations
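A minimal sketch of lazy evaluation, assuming local PySpark: transformations are only recorded until an action triggers execution.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("lazy-demo").getOrCreate()

df = spark.range(1_000_000)
filtered = df.filter("id % 2 = 0")            # transformation: recorded, not run
doubled = filtered.selectExpr("id * 2 AS x")  # chained into the same stage

doubled.explain()       # prints the physical plan Spark built from the DAG
print(doubled.count())  # action: only now does the job actually execute

spark.stop()
```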
I applied via LinkedIn and was interviewed in Jun 2023. There were 4 interview rounds.
I applied via Naukri.com
I have worked on various AWS services including S3, EC2, Lambda, Glue, and Redshift.
S3 - Used for storing and retrieving data
EC2 - Used for running virtual servers
Lambda - Used for serverless computing
Glue - Used for ETL (Extract, Transform, Load) processes
Redshift - Used for data warehousing and analytics
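A hypothetical boto3 sketch touching a few of these services; the bucket, function, and job names are placeholders, and valid AWS credentials are assumed to be available in the environment:

```python
import boto3

# All names below are placeholders; valid AWS credentials are assumed
# to be configured in the environment.

# S3: store and retrieve objects.
s3 = boto3.client("s3")
s3.upload_file("local_data.csv", "my-example-bucket", "raw/data.csv")
s3.download_file("my-example-bucket", "raw/data.csv", "data_copy.csv")

# Lambda: invoke a serverless function.
lam = boto3.client("lambda")
resp = lam.invoke(FunctionName="my-example-etl-fn", Payload=b"{}")
print(resp["StatusCode"])

# Glue: start an ETL job run.
glue = boto3.client("glue")
glue.start_job_run(JobName="my-example-glue-job")
```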
I applied via Company Website and was interviewed in Mar 2024. There was 1 interview round.
The company approached me directly, and I was interviewed in Jun 2023. There were 3 interview rounds.
Spark internal flow involves job submission, DAG creation, task scheduling, and execution.
Job submission: User submits a Spark job to the SparkContext.
DAG creation: SparkContext creates a Directed Acyclic Graph (DAG) of the job.
Task scheduling: DAGScheduler breaks the DAG into stages and tasks, which are scheduled by TaskScheduler.
Task execution: Executors execute the tasks and return results to the driver.
Result fetching: The driver collects the final results from the executors.
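A small sketch of this flow, assuming local PySpark; the shuffle introduced by reduceByKey forces a stage boundary, which toDebugString makes visible:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("flow-demo").getOrCreate()
sc = spark.sparkContext

# reduceByKey requires a shuffle, so the scheduler splits the DAG
# into two stages at that boundary.
rdd = (
    sc.parallelize(range(100))
      .map(lambda x: (x % 3, x))
      .reduceByKey(lambda a, b: a + b)
)

# The lineage/DAG; indentation marks the stage boundary.
print(rdd.toDebugString().decode())

# Action: the driver submits the job, executors run the tasks,
# and the results are fetched back to the driver.
print(rdd.collect())

spark.stop()
```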
Hive metastore stores metadata about Hive tables, partitions, columns, and storage locations.
The Hive metastore is a central repository that stores metadata about Hive tables, partitions, columns, and storage locations.
It stores this metadata in a relational database such as MySQL, Derby, or PostgreSQL.
The metadata includes information such as table names, column names, data types, file formats, and storage locations.
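A minimal sketch, assuming a PySpark build with Hive support; by default Spark creates a local Derby-backed metastore in the working directory:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("metastore-demo")
    .master("local[*]")
    .enableHiveSupport()   # requires a Spark build with Hive support
    .getOrCreate()
)

# Creating a table registers its name, schema, format, and location
# in the metastore.
spark.sql("CREATE TABLE IF NOT EXISTS demo_tbl (id INT, name STRING) USING parquet")

# DESCRIBE EXTENDED reads that metadata back from the metastore.
spark.sql("DESCRIBE EXTENDED demo_tbl").show(truncate=False)

spark.stop()
```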
SQL query for simple tables
Use SELECT statement to retrieve data
Specify the columns you want to select
Use FROM clause to specify the tables you are querying from
Add WHERE clause to filter the results if needed
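A minimal sketch of this pattern with Python's sqlite3 module; the 'employees' table is made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT);
    INSERT INTO employees VALUES (1, 'Asha', 'Data'), (2, 'Ravi', 'HR');
""")

# SELECT <columns> FROM <table> WHERE <filter>
rows = con.execute(
    "SELECT id, name FROM employees WHERE dept = 'Data'"
).fetchall()
print(rows)  # [(1, 'Asha')]
```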
Basic logical and programming questions
| Designation | Salaries reported | Salary range |
| --- | --- | --- |
| Software Developer | 549 | ₹4.4 L/yr - ₹15.1 L/yr |
| Software Development Engineer | 521 | ₹4.5 L/yr - ₹13.2 L/yr |
| Assistant Manager | 463 | ₹3 L/yr - ₹10 L/yr |
| Product Manager | 417 | ₹11.9 L/yr - ₹40 L/yr |
| Deputy Manager | 350 | ₹5 L/yr - ₹19.4 L/yr |