Partition in Spark is a way to divide data into smaller chunks for parallel processing.
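Partitions are the basic units of parallelism in Spark.
Data in an RDD is divided into partitions, which are processed in parallel.
The number of partitions can be changed with repartition(), which performs a full shuffle, or coalesce(), which reduces the partition count and avoids a full shuffle by default.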
Bucketing distributes rows into a fixed number of files (buckets) based on a hash of the bucketing column, while partitioning splits data into separate directories based on column values.
Bucketing is used to spread data across a fixed number of files, which can improve the performance of joins and sampling on the bucketed column.
Partitioning is used to organize data by specific column values so that queries filtering on those values read only the matching directories.
Example: Bucketing can be used to evenly distribute sa...
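The difference between the two layouts can be illustrated with a minimal plain-Java sketch. It does not use Hive or Spark; the class name, the sample rows, and the use of String.hashCode() are assumptions made for illustration only (real engines use their own hash functions and file layouts).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class BucketingSketch {
    // Bucketing: the number of buckets is fixed up front and a row's bucket
    // is chosen by hashing the bucketing key (hashCode() is an illustrative stand-in).
    static int bucketFor(String key, int numBuckets) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    // Partitioning: one group (think "directory") per distinct value of the column.
    static Map<String, List<String>> partitionBy(List<String[]> rows, int columnIndex) {
        Map<String, List<String>> partitions = new TreeMap<>();
        for (String[] row : rows) {
            partitions.computeIfAbsent(row[columnIndex], k -> new ArrayList<>())
                      .add(String.join(",", row));
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical rows: customer id, country
        List<String[]> rows = Arrays.asList(
            new String[] {"c1", "IN"},
            new String[] {"c2", "US"},
            new String[] {"c3", "IN"});

        // Partitioned by country: as many groups as there are distinct values.
        System.out.println(partitionBy(rows, 1).keySet());

        // Bucketed by customer id: always at most 4 buckets, whatever the data.
        for (String[] row : rows) {
            System.out.println(row[0] + " -> bucket " + bucketFor(row[0], 4));
        }
    }
}
```

The point of the sketch is that the number of partitions grows with the number of distinct column values, while the number of buckets stays fixed, which is why bucketing suits high-cardinality columns.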
Hive tables are used to store structured data in Hive, similar to tables in a traditional database.
Hive tables are created using the CREATE TABLE statement.
Tables can be partitioned based on one or more columns.
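External tables in Hive point to data at a user-specified location; Hive manages only the metadata, so dropping the table leaves the data files in place.
Managed tables store data in Hive's default warehouse location in HDFS; dropping the table deletes the data as well.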
Tables can be queried using SQL-like syntax in HiveQL.
Types of read mode in Spark include permissive, dropMalformed, and failFast.
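Permissive mode (the default) - keeps corrupted records, setting fields that cannot be parsed to null; the raw record can be captured in a corrupt-record column such as _corrupt_record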
DropMalformed mode - drops corrupted records during reading
FailFast mode - fails immediately upon encountering corrupted records
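The behaviour of the three modes can be mimicked in a small plain-Java sketch. This is a simulation, not the Spark API: the class, the enum, and the "id,amount" line format are hypothetical, and a malformed line is reduced to an all-null row as a simplification of what permissive mode does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadModeSketch {
    enum Mode { PERMISSIVE, DROPMALFORMED, FAILFAST }

    // Parses lines of the hypothetical form "id,amount" (both integers).
    // A null entry stands for a field that could not be parsed.
    static List<Integer[]> read(List<String> lines, Mode mode) {
        List<Integer[]> rows = new ArrayList<>();
        for (String line : lines) {
            try {
                String[] parts = line.split(",");
                if (parts.length != 2) {
                    throw new NumberFormatException("expected 2 fields");
                }
                rows.add(new Integer[] {
                    Integer.valueOf(parts[0].trim()),
                    Integer.valueOf(parts[1].trim())});
            } catch (NumberFormatException e) {
                switch (mode) {
                    case PERMISSIVE:
                        // Keep the row, with nulls in place of unparsable fields.
                        rows.add(new Integer[] {null, null});
                        break;
                    case DROPMALFORMED:
                        // Silently skip the bad record.
                        break;
                    case FAILFAST:
                        // Abort the whole read on the first bad record.
                        throw new IllegalStateException("Malformed record: " + line, e);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,10", "oops", "2,20");
        System.out.println(read(lines, Mode.PERMISSIVE).size());    // 3 rows, one with nulls
        System.out.println(read(lines, Mode.DROPMALFORMED).size()); // 2 rows
    }
}
```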
I applied via Naukri.com and was interviewed in Oct 2020. There was 1 interview round.
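Second-level cache is a caching mechanism that improves performance by keeping frequently accessed data in memory so that repeated reads do not hit the database.
In Hibernate, the second-level cache is scoped to the SessionFactory and shared across sessions, unlike the first-level cache, which belongs to a single session; it is optional and must be enabled through configuration.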
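To optimize SQL queries, one can add indexes on columns used in filters and joins, select only the needed columns instead of using SELECT *, and rewrite subqueries as JOINs where that yields a better execution plan.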
SOLID principles are a set of design principles for writing maintainable and scalable cod...
To deploy an application in AWS, you need to create an EC2 instance, configure security groups, install necessary software, and upload your application code.
Create an EC2 instance in the desired region and select the appropriate instance type
Configure security groups to allow traffic to and from the instance
Install necessary software and dependencies on the instance
Upload your application code to the instance
Start the ...
I applied via Company Website and was interviewed before Aug 2020. There were 4 interview rounds.
I applied via Campus Placement and was interviewed in Dec 2020. There were 4 interview rounds.
I applied via Recruitment Consultant and was interviewed before Jun 2021. There was 1 interview round.
BigInteger is used for mathematical operations involving very large integers in Java.
BigInteger is used when the range of values supported by primitive data types like int and long is not sufficient.
It is commonly used in cryptography and security applications.
It provides methods for arithmetic, bitwise, and logical operations on large integers.
Example: calculating factorial of a large number, generating large prime nu
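As a minimal sketch of the factorial use case, the following computes a factorial whose value no longer fits in a long (20! is the largest factorial that does). The class and method names are illustrative assumptions; only java.math.BigInteger itself comes from the JDK.

```java
import java.math.BigInteger;

class BigIntegerSketch {
    // Computes n! with arbitrary precision.
    static BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 25! overflows a long but is exact as a BigInteger.
        System.out.println(factorial(25)); // prints 15511210043330985984000000
    }
}
```

BigInteger values are immutable, so each multiply() call returns a new object rather than modifying result in place.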
Merge sort algorithm code in Java
Divide the array into two halves
Recursively sort the two halves
Merge the sorted halves
Time complexity: O(n log n)
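The three steps above can be sketched in Java as follows. This is one straightforward top-down version that copies the halves into temporary arrays; the class and method names are illustrative, and other variants (in-place or bottom-up) are equally valid.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Sorts the array in place in ascending order.
    static void mergeSort(int[] a) {
        if (a.length < 2) {
            return; // zero or one element is already sorted
        }
        // Divide the array into two halves.
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        // Recursively sort the two halves.
        mergeSort(left);
        mergeSort(right);
        // Merge the sorted halves back into the original array.
        merge(a, left, right);
    }

    static void merge(int[] out, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // "<=" keeps equal elements in their original order (stable sort).
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            out[k++] = left[i++];
        }
        while (j < right.length) {
            out[k++] = right[j++];
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        mergeSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 5, 5, 6, 9]
    }
}
```

Because the halves are copied, this version uses O(n) extra space in addition to the O(n log n) running time.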
I applied via Naukri.com and was interviewed in Aug 2020. There was 1 interview round.
I applied via Walk-in and was interviewed in Aug 2020. There were 4 interview rounds.
I applied via Referral and was interviewed in Mar 2021. There was 1 interview round.