Iris Software
I applied via Naukri.com and was interviewed in Dec 2024. There were 4 interview rounds.
I applied via LinkedIn and was interviewed in Jul 2024. There were 2 interview rounds.
It was a pair programming round where we had to work through a couple of Spark scenarios together with the interviewer. You are given boilerplate code with some functionality to fill in, and you are assessed on writing clean, extensible code and test cases.
Python and SQL questions.
I was approached by the company and interviewed in Oct 2023. There were 2 interview rounds.
I applied via Referral and was interviewed before Jul 2023. There were 3 interview rounds.
I applied via Naukri.com and was interviewed in Apr 2023. There was 1 interview round.
I applied via Referral and was interviewed before May 2023. There were 2 interview rounds.
Spark is a distributed computing framework that provides in-memory processing capabilities for big data analytics.
Spark has a master-slave architecture with a central coordinator called the Driver and distributed workers called Executors.
It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.
Spark supports various data sources like HDFS, Cassandra, HBase, and S3 for input and output.
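A minimal PySpark sketch of these ideas, assuming a local installation; the app name and sample data are invented for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-demo").master("local[*]").getOrCreate()

# parallelize() splits the collection into partitions across executors;
# lost partitions can be recomputed from the RDD's lineage (fault tolerance).
rdd = spark.sparkContext.parallelize([1, 2, 3, 4, 5])

# Transformations like map() are lazy; the action collect() triggers execution.
squared = rdd.map(lambda x: x * x)
print(squared.collect())  # [1, 4, 9, 16, 25]

spark.stop()
```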
SQL code for handling various situations in data analysis
Use CASE statements for conditional logic
Use COALESCE function to handle NULL values
Use GROUP BY and HAVING clauses for aggregating data
Use subqueries for complex filtering or calculations
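A hedged sketch tying most of these techniques together in one Spark SQL query; the orders table and its columns are made up for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-demo").master("local[*]").getOrCreate()

data = [("north", 100), ("north", None), ("south", 250), ("south", 40)]
spark.createDataFrame(data, ["region", "amount"]).createOrReplaceTempView("orders")

spark.sql("""
    SELECT region,
           SUM(COALESCE(amount, 0))                     AS total,      -- NULL handling
           SUM(CASE WHEN amount > 50 THEN 1 ELSE 0 END) AS big_orders  -- conditional logic
    FROM orders
    GROUP BY region                                                    -- aggregation
    HAVING SUM(COALESCE(amount, 0)) > 100                              -- filter on aggregates
""").show()

spark.stop()
```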
To create a Spark DataFrame, use the createDataFrame() method.
Import the necessary libraries
Create a list of tuples or a dictionary containing the data
Create a schema for the DataFrame
Use the createDataFrame() method to create the DataFrame
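A minimal sketch of those steps in PySpark; the column names and rows are illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.appName("df-demo").master("local[*]").getOrCreate()

# Step 2: a list of tuples containing the data
data = [("alice", 30), ("bob", 25)]

# Step 3: an explicit schema for the DataFrame
schema = StructType([
    StructField("name", StringType(), False),
    StructField("age", IntegerType(), True),
])

# Step 4: createDataFrame() builds the DataFrame from the data and schema
df = spark.createDataFrame(data, schema)
df.show()

spark.stop()
```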
I applied via Recruitment Consultant and was interviewed in Sep 2024. There were 2 interview rounds.
Accumulators are shared variables that are updated by worker nodes and can be used for aggregating information across tasks.
Accumulators are used for implementing counters and sums in Spark.
Worker tasks can only add to them; only the driver program can read their value.
Accumulators are useful for debugging and monitoring purposes.
Example: counting the number of errors encountered during processing.
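A minimal sketch of that error-counting example in PySpark; the "bad record" condition is invented for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("acc-demo").master("local[*]").getOrCreate()
sc = spark.sparkContext

error_count = sc.accumulator(0)  # shared counter, added to by tasks

def parse(line):
    try:
        return int(line)
    except ValueError:
        error_count.add(1)  # workers can only add; they never read the value
        return 0

total = sc.parallelize(["1", "2", "oops", "4"]).map(parse).sum()

# Only the driver reads the accumulated value, after an action has run.
print(total, error_count.value)  # 7 2

spark.stop()
```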
Spark architecture is a distributed computing framework that consists of a driver program, cluster manager, and worker nodes.
Spark architecture includes a driver program that manages the execution of the Spark application.
It also includes a cluster manager that allocates resources and schedules tasks on worker nodes.
Worker nodes are responsible for executing the tasks and storing data in memory or disk.
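A short sketch of where each architecture component surfaces in code, assuming PySpark; the resource settings are illustrative:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("arch-demo")
    .master("local[*]")                     # cluster manager: local, yarn, spark://...
    .config("spark.executor.memory", "2g")  # resources requested per executor
    .config("spark.executor.cores", "2")
    .getOrCreate()
)

# The driver (this process) plans the job; executors run its tasks.
print(spark.range(1_000_000).selectExpr("sum(id)").collect())

spark.stop()
```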
Query to find duplicate data using SQL
Use GROUP BY and HAVING clause to identify duplicate records
Select columns to check for duplicates
Use COUNT() function to count occurrences of each record
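A hedged sketch of the duplicate-finding query; the users table and its columns are invented for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dup-demo").master("local[*]").getOrCreate()

rows = [("a@x.com", "alice"), ("b@x.com", "bob"), ("a@x.com", "alice")]
spark.createDataFrame(rows, ["email", "name"]).createOrReplaceTempView("users")

spark.sql("""
    SELECT email, name, COUNT(*) AS occurrences  -- count each record
    FROM users
    GROUP BY email, name                         -- columns checked for duplicates
    HAVING COUNT(*) > 1                          -- keep only repeated records
""").show()

spark.stop()
```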
Pub/sub is a messaging pattern where senders (publishers) of messages do not program the messages to be sent directly to specific receivers (subscribers).
Pub/sub stands for publish/subscribe.
Publishers send messages to a topic, and subscribers receive messages from that topic.
It allows for decoupling of components in a system, enabling scalability and flexibility.
Examples include Apache Kafka and Google Cloud Pub/Sub, among others.
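A minimal in-memory sketch of the pattern itself (not any particular broker); the Broker class and topic names are invented for illustration:

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Routes messages from publishers to subscribers by topic."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Publishers never address receivers directly: that is the decoupling.
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
broker.subscribe("orders", lambda m: print("billing saw:", m))
broker.subscribe("orders", lambda m: print("shipping saw:", m))
broker.publish("orders", "order-42 created")
```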
I have used services like BigQuery, Dataflow, Pub/Sub, and Cloud Storage in GCP.
BigQuery for data warehousing and analytics
Dataflow for real-time data processing
Pub/Sub for messaging and event ingestion
Cloud Storage for storing data and files
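As one hedged example, querying BigQuery from Python with the google-cloud-bigquery client; the project, dataset, and query are placeholders, and valid GCP credentials are assumed:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # hypothetical project id

query = """
    SELECT region, COUNT(*) AS events
    FROM `my-gcp-project.analytics.events`  -- hypothetical table
    GROUP BY region
"""

# query() submits the job; result() blocks until it finishes and yields rows.
for row in client.query(query).result():
    print(row.region, row.events)
```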
Salaries at Iris Software
| Role | Salaries reported | Salary range |
| Senior Software Engineer | 565 | ₹10 L/yr - ₹32 L/yr |
| Technical Lead | 526 | ₹15 L/yr - ₹36.2 L/yr |
| Senior Engineer | 396 | ₹9.5 L/yr - ₹32 L/yr |
| Senior Technical Consultant | 387 | ₹9.5 L/yr - ₹29 L/yr |
| Senior Technology Engineer | 321 | ₹11.2 L/yr - ₹32 L/yr |