Impetus Technologies
I applied via Naukri.com and was interviewed in Jun 2022. There were 4 interview rounds.
The interviewer asked me to explain partitioning, bucketing, joins, optimization, broadcast variables, ORC vs. Parquet, RDD vs. DataFrame, and my project architecture and responsibilities for the Big Data Engineer role.
Partitioning is dividing data into smaller chunks for parallel processing, while bucketing is organizing data into buckets based on a hash function.
Types of joins in Spark include inner, left outer, right outer, full outer, left semi, left anti, and cross joins.
Remove duplicate records and find 5th highest salary department wise using SQL.
Use the DISTINCT keyword (or GROUP BY) to remove duplicate records first.
Use the DENSE_RANK() window function, partitioned by department and ordered by salary descending, to rank salaries within each department.
Filter on rank = 5 to get the 5th highest salary for every department.
Note that a plain ORDER BY with LIMIT only works for a single department; the window function handles all departments in one query.
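The steps above can be sketched as a runnable query. This is an illustrative example using SQLite's window-function support with made-up sample data (table name `emp` and the salary values are assumptions, not from the interview):

```python
import sqlite3

# In-memory table of employee salaries (sample data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
rows = [
    ("a1", "eng", 100), ("a2", "eng", 90), ("a3", "eng", 80),
    ("a4", "eng", 70), ("a5", "eng", 60), ("a6", "eng", 50),
    ("b1", "hr", 95), ("b2", "hr", 85), ("b3", "hr", 85),  # duplicate salary
    ("b4", "hr", 75), ("b5", "hr", 65), ("b6", "hr", 55),
]
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)

# DISTINCT removes duplicate (dept, salary) pairs; DENSE_RANK ranks
# salaries within each department; rank 5 is the 5th highest.
query = """
SELECT dept, salary FROM (
    SELECT dept, salary,
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rk
    FROM (SELECT DISTINCT dept, salary FROM emp)
) WHERE rk = 5
"""
result = dict(conn.execute(query).fetchall())
print(result)  # {'eng': 60, 'hr': 55}
```

The same DENSE_RANK pattern works unchanged in Spark SQL or Hive.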
I applied via Naukri.com and was interviewed in Jul 2021. There was 1 interview round.
Spark handles upserts through the MERGE operation provided by ACID table formats such as Delta Lake.
Plain DataFrames have no merge() or upsert method; the target table must support merges (Delta Lake, Apache Iceberg, or Apache Hudi).
Specify a merge condition on the primary key column(s) to identify matching rows.
Use whenMatched to update existing rows and whenNotMatched to insert new rows.
Example (Delta Lake Python API): target.alias('t').merge(updates.alias('u'), 't.id = u.id').whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
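The matched/not-matched semantics of an upsert can be sketched in plain Python, independent of any Spark API. The dict-based "tables" below are a teaching device, not how a merge is actually executed:

```python
# A minimal sketch of upsert (MERGE) semantics in plain Python.
# Each "table" is a dict keyed by the primary key; values are row dicts.

def upsert(target, updates):
    """When matched on the key, update the row; when not matched, insert it."""
    for key, row in updates.items():
        if key in target:
            target[key].update(row)   # matched: update the existing row
        else:
            target[key] = dict(row)   # not matched: insert a new row
    return target

target = {1: {"name": "alice", "city": "pune"},
          2: {"name": "bob", "city": "delhi"}}
updates = {2: {"city": "mumbai"},               # existing key -> update
           3: {"name": "carol", "city": "noida"}}  # new key -> insert
upsert(target, updates)
print(target)
```

After the call, row 2's city is updated in place and row 3 is inserted, while row 1 is untouched.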
I applied via Naukri.com and was interviewed in Jun 2021. There were 4 interview rounds.
In Java, Collection is an interface while Collections is a utility class.
Collection is the root interface of the Java Collections Framework, providing a unified architecture for storing and manipulating groups of objects.
Collections is a utility class in java.util that provides static helper methods such as sort(), reverse(), and unmodifiableList().
Both belong to the Java Collections Framework; the difference is interface vs. helper class, not framework membership.
Subinterfaces of Collection include List, Set, and Queue; Map is part of the framework but does not extend Collection.
Spark memory optimisation techniques
Use broadcast variables to reduce memory usage
Use persist() or cache() to store RDDs in memory
Use partitioning to reduce shuffling and memory usage
Use off-heap memory to avoid garbage collection overhead
Tune memory settings such as spark.driver.memory and spark.executor.memory
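The memory settings above are typically passed as configuration at submit time. A sketch of a spark-submit invocation — the values and the job file name `my_job.py` are placeholders, not tuning recommendations:

```shell
# Illustrative spark-submit flags for the memory settings mentioned above.
# Values are placeholders; tune them against your actual workload.
spark-submit \
  --conf spark.driver.memory=4g \
  --conf spark.executor.memory=8g \
  --conf spark.memory.fraction=0.6 \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=2g \
  my_job.py
```

spark.memory.offHeap.* enables the off-heap storage mentioned above, which keeps cached data outside the JVM heap and out of the garbage collector's reach.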
Hadoop serialisation techniques are used to convert data into a format that can be stored and processed in Hadoop.
Hadoop uses Writable interface for serialisation and deserialisation of data
Avro, Thrift, and Protocol Buffers are popular serialisation frameworks used in Hadoop
Serialisation can be customised using custom Writable classes or external libraries
Serialisation plays a crucial role in Hadoop performance and efficiency.
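The core idea behind Writable — compact binary fields written and read back in a fixed order — can be sketched in pure Python with the struct module. This illustrates the concept only; it is not Hadoop's actual wire format, and the (word, count) record type is an invented example:

```python
import struct

# A sketch of Writable-style binary serialisation: a length-prefixed
# UTF-8 string followed by a 4-byte big-endian integer.
# (Concept demo only; NOT Hadoop's actual wire format.)

def serialize(word, count):
    data = word.encode("utf-8")
    return struct.pack(">I", len(data)) + data + struct.pack(">i", count)

def deserialize(buf):
    (n,) = struct.unpack_from(">I", buf, 0)          # read the length prefix
    word = buf[4:4 + n].decode("utf-8")              # then the string bytes
    (count,) = struct.unpack_from(">i", buf, 4 + n)  # then the integer
    return word, count

blob = serialize("hadoop", 42)
print(deserialize(blob))  # ('hadoop', 42)
```

As with Writable, reader and writer must agree on the field order and widths; the compactness compared with text formats is what makes binary serialisation matter for Hadoop I/O.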
I applied via Naukri.com and was interviewed in Jan 2021. There were 5 interview rounds.
I was interviewed before Sep 2020.
Round duration - 140 minutes
Round difficulty - Medium
The test was at 2:00 pm and was conducted at a college; the environment was good. The camera was a primary part of the test, so no suspicious activity was possible.
Given two numbers in the form of two arrays, where each element represents a digit, calculate the sum of these two numbers and return the result as an array.
Iterate through the arrays from right to left, adding digits and carrying over if necessary
Handle cases where one array is longer than the other by considering the remaining digits
Ensure the final sum array does not have any leading zeros
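The three steps above can be sketched directly. The function name and sample inputs are illustrative:

```python
def add_digit_arrays(a, b):
    """Add two numbers given as digit arrays (most significant digit first)."""
    i, j = len(a) - 1, len(b) - 1
    carry, out = 0, []
    # Walk both arrays right to left, adding digits and carrying over;
    # the loop also drains whichever array is longer, plus a final carry.
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += a[i]
            i -= 1
        if j >= 0:
            total += b[j]
            j -= 1
        out.append(total % 10)
        carry = total // 10
    out.reverse()
    # Strip leading zeros, keeping at least one digit.
    while len(out) > 1 and out[0] == 0:
        out.pop(0)
    return out

print(add_digit_arrays([9, 9, 9], [1]))        # [1, 0, 0, 0]
print(add_digit_arrays([1, 2, 3], [4, 5, 6]))  # [5, 7, 9]
```

The running time is O(max(len(a), len(b))), since each digit is visited once.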
Round duration - 20 minutes
Round difficulty - Easy
The round was conducted at around 12 p.m. at the college. The interviewer was quite polite and friendly.
Round duration - 8 minutes
Round difficulty - Easy
This round was conducted right after I cleared the technical round, at the same place and on the same day.
Tip 1 : Practice at least 2-3 coding problems daily so your logic building becomes stronger.
Tip 2 : Exercise problems based on OOPS concepts and others too.
Tip 3 : Having built your own project is a major plus point.
Tip 1 : Your resume should be in standard form, short and simple will be more effective.
Tip 2 : Mention whatever you have learned in your resume, as it will be your primary source of selection; having a project on your resume is important.
I applied via Campus Placement and was interviewed before Feb 2020. There were 4 interview rounds.
I applied via Campus Placement and was interviewed in Dec 2020. There was 1 interview round.
I applied via Campus Placement and was interviewed before Jun 2020. There were 3 interview rounds.
I applied via Job Portal and was interviewed before Jan 2021. There were 3 interview rounds.
I applied via Other and was interviewed before Nov 2020. There was 1 interview round.
Big data refers to large and complex data sets that cannot be processed using traditional data processing tools.
Big data is characterized by the 3Vs - volume, velocity, and variety.
It requires specialized tools and technologies such as Hadoop, Spark, and NoSQL databases.
Big data is used in various industries such as healthcare, finance, and retail to gain insights and make data-driven decisions.