Impetus Technologies
AstaGuru Auction House Interview Questions and Answers
Q1. Difference between partitioning and bucketing; types of joins in Spark; optimization techniques in Spark; broadcast variables and broadcast joins; ORC vs Parquet; RDD vs DataFrame; project architecture and responsibilities for a Big Data Engineer role.
Explaining partitioning, bucketing, joins, optimization, broadcast variables, ORC vs Parquet, RDD vs DataFrame, and project architecture and responsibilities for a Big Data Engineer role.
Partitioning is dividing data into smaller chunks for parallel processing, while bucketing is organizing data into buckets based on a hash function.
Types of joins in Spark include inner join, outer join, left join, right join, and full outer join.
Optimization techniques in Spark include caching, broadcast joins, and reducing shuffles through sensible partitioning.
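The broadcast-join idea mentioned above can be sketched in plain Python (this is an illustration of the mechanism, not the Spark API; the table data and names are made up): the small table is copied to every partition so the join happens locally, with no shuffle of the large side.

```python
# Minimal sketch of a broadcast (map-side) join.
# In Spark the small table is shipped to every executor;
# here each "partition" joins against an in-memory copy of it.

small_table = {1: "Electronics", 2: "Books"}  # broadcast side: dept_id -> name

# large side, pre-split into partitions: (order_id, dept_id)
partitions = [
    [(101, 1), (102, 2)],
    [(103, 1), (104, 3)],  # dept_id 3 has no match
]

def join_partition(rows, broadcast):
    # inner join: keep only rows whose dept_id exists in the broadcast copy
    return [(oid, did, broadcast[did]) for oid, did in rows if did in broadcast]

result = [row for part in partitions for row in join_partition(part, small_table)]
print(result)
# the unmatched row (104, 3) is dropped by the inner join
```

In real Spark code the same effect comes from `broadcast(small_df)` in a join, or automatically when the small side is below `spark.sql.autoBroadcastJoinThreshold`.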
Q2. (Second round) How to handle upserts in Spark?
Core Spark has no built-in upsert; the usual approach is Delta Lake's MERGE (or an equivalent join-and-overwrite pattern).
Use Delta Lake's MERGE INTO (SQL) or DeltaTable.merge() (DataFrame API) to handle upserts.
Match target and source rows on the primary key column(s).
whenMatchedUpdate clauses update existing rows; whenNotMatchedInsert clauses insert new rows.
Without Delta Lake, emulate an upsert by joining the new data with the existing data and rewriting the output.
Example: deltaTable.alias("t").merge(updates.alias("s"), "t.id = s.id").whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
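The matched-update / not-matched-insert semantics can be shown in plain Python (a sketch of the logic only; in Spark this is typically Delta Lake's MERGE, and the row data here is invented):

```python
# Upsert (merge) semantics: rows whose key exists in the target are
# updated; rows with a new key are inserted.

target = {1: {"id": 1, "name": "alice"}, 2: {"id": 2, "name": "bob"}}
updates = [{"id": 2, "name": "bobby"}, {"id": 3, "name": "carol"}]

def upsert(target, updates, key="id"):
    for row in updates:
        # whenMatched -> update, whenNotMatched -> insert
        target[row[key]] = row
    return target

merged = upsert(target, updates)
print(sorted(merged))  # keys after the merge: [1, 2, 3]
```

In Delta Lake the equivalent is `merge(source, "t.id = s.id")` followed by `whenMatchedUpdateAll()` and `whenNotMatchedInsertAll()`.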
Q3. SQL question Remove duplicate records 5th highest salary department wise
Remove duplicate records and find the 5th highest salary per department using SQL.
Use the DISTINCT keyword (or ROW_NUMBER() = 1) to remove duplicate records.
Use DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) to rank salaries within each department.
Filter for rank = 5 to get the 5th highest salary in each department.
Note that ORDER BY with LIMIT/OFFSET only works for a single group; a window function is needed for a per-department answer.
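The pattern above can be demonstrated with Python's built-in sqlite3 (SQLite supports window functions from 3.25; the table and column names are illustrative):

```python
# Dedupe with DISTINCT, then DENSE_RANK() per department to pick the
# Nth highest salary. HR has fewer than 5 distinct salaries, so only
# IT appears in the result.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
rows = [("IT", s) for s in (90, 80, 80, 70, 60, 50, 40)] + [("HR", 55), ("HR", 45)]
conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)

query = """
WITH dedup AS (SELECT DISTINCT dept, salary FROM emp),
ranked AS (
    SELECT dept, salary,
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rk
    FROM dedup
)
SELECT dept, salary FROM ranked WHERE rk = 5
"""
print(conn.execute(query).fetchall())
# IT's distinct salaries descending are 90, 80, 70, 60, 50, 40 -> 5th is 50
```

Swapping `DENSE_RANK` for `ROW_NUMBER` would change the behaviour when ties remain after deduplication.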
Q4. Spark memory optimisation techniques
Reduce Spark memory pressure through caching choices, partitioning, off-heap storage, and memory configuration.
Use broadcast variables to reduce memory usage
Use persist() or cache() to store RDDs in memory
Use partitioning to reduce shuffling and memory usage
Use off-heap memory to avoid garbage collection overhead
Tune memory settings such as spark.driver.memory and spark.executor.memory
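The tuning settings listed above usually live in spark-defaults.conf or are passed via spark-submit; a fragment might look like this (the values are placeholders to be tuned per cluster, not recommendations):

```properties
# Illustrative spark-defaults.conf fragment
spark.driver.memory              4g
spark.executor.memory            8g
spark.memory.fraction            0.6
spark.memory.offHeap.enabled     true
spark.memory.offHeap.size        2g
spark.sql.shuffle.partitions     200
```

`spark.memory.fraction` controls the share of heap used for execution and storage, and the off-heap settings move cached data outside the JVM heap to reduce GC pressure.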
Q5. Hadoop serialisation techniques.
Hadoop serialisation techniques are used to convert data into a format that can be stored and processed in Hadoop.
Hadoop uses Writable interface for serialisation and deserialisation of data
Avro, Thrift, and Protocol Buffers are popular serialisation frameworks used in Hadoop
Serialisation can be customised using custom Writable classes or external libraries
Serialisation plays a crucial role in Hadoop performance and efficiency
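For the Avro framework mentioned above, serialisation is driven by a JSON schema; a minimal example record schema might look like this (the record and field names are invented for illustration):

```json
{
  "type": "record",
  "name": "ClickEvent",
  "fields": [
    {"name": "userId", "type": "long"},
    {"name": "url", "type": "string"},
    {"name": "ts", "type": "long"}
  ]
}
```

Because the schema travels with the data, Avro files remain self-describing and support schema evolution, which is one reason it is popular in Hadoop pipelines.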
Q6. Java collection vs collections
Collection is an interface, while Collections is a utility class.
The Collection interface is the root of the hierarchy for storing and manipulating groups of objects.
Collections is a utility class in java.util that provides static helper methods for working with collections.
Both belong to the Java Collections Framework: Collection is its root interface, Collections its companion utility class.
Examples of Collection subtypes include List and Set (Map is a separate hierarchy in the framework); examples of Collections methods include sort(), reverse(), and unmodifiableList().