10+ MResult Services Interview Questions and Answers

Updated 22 Nov 2024

Q1. Write code for printing duplicate numbers in a list.

Ans.

Code to print duplicate numbers in a list.

  • Iterate through the list and keep track of the count of each number using a dictionary.

  • Print the numbers that have a count greater than 1.
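
The two steps above can be sketched in Python as follows. This is one possible answer, not the only one; the function name `find_duplicates` and the sample list are illustrative assumptions.

```python
from collections import Counter


def find_duplicates(numbers):
    """Return the values that occur more than once, in first-seen order."""
    # Counter is a dictionary subclass mapping each value to its count.
    counts = Counter(numbers)
    return [value for value, count in counts.items() if count > 1]


if __name__ == "__main__":
    # Sample input chosen for illustration only.
    for value in find_duplicates([4, 7, 4, 9, 7, 4, 1]):
        print(value)
```

Counting in one pass and filtering afterwards keeps the work linear in the length of the list.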

Q2. Write code to print reverse of a sentence word by word.

Ans.

Code to print reverse of a sentence word by word.

  • Split the sentence into words using space as delimiter

  • Store the words in an array

  • Print the words in reverse order
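
A minimal Python sketch of the steps above. The function name `reverse_words` is an illustrative assumption, and `split()` without arguments is used so that repeated spaces do not produce empty words.

```python
def reverse_words(sentence):
    """Return the sentence with its words in reverse order."""
    # split() with no argument splits on runs of whitespace and
    # drops leading and trailing whitespace.
    words = sentence.split()
    # reversed() walks the word list from the last word to the first.
    return " ".join(reversed(words))


if __name__ == "__main__":
    print(reverse_words("Spark makes big data simple"))
```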

Q3. Difference between cache and persist, repartition and coalesce.

Ans.

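Cache and persist both keep a computed dataset available for reuse. Repartition and coalesce both change the number of partitions.

  • cache() uses the default storage level (MEMORY_ONLY for RDDs, MEMORY_AND_DISK for DataFrames and Datasets), while persist() lets the user choose the storage level.

  • repartition() can increase or decrease the number of partitions and always performs a full shuffle, while coalesce() by default only decreases it and avoids a full shuffle by merging existing partitions.

  • All four are lazy: cache and persist only mark the data for storage, and repartition and coalesce are transformations. Nothing runs until an action such as count() is called.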

  • Cache and persist are used for iterative algorithms while repartition and...
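
The partitioning half of this answer can be illustrated with a toy model in plain Python. This is not Spark code. It assumes that coalesce merges whole existing partitions without a full shuffle, while repartition redistributes every record; the functions `coalesce` and `repartition` below are hypothetical stand-ins that operate on lists of lists.

```python
def coalesce(partitions, n):
    """Merge whole existing partitions into at most n groups (no full shuffle)."""
    n = min(n, len(partitions))  # this model never increases the partition count
    merged = [[] for _ in range(n)]
    for index, part in enumerate(partitions):
        # Neighbouring partitions are glued together; records are not rebalanced.
        merged[index * n // len(partitions)].extend(part)
    return merged


def repartition(partitions, n):
    """Redistribute every record round-robin across n new partitions."""
    fresh = [[] for _ in range(n)]
    records = [record for part in partitions for record in part]
    for i, record in enumerate(records):
        fresh[i % n].append(record)  # every record may move: a full shuffle
    return fresh


if __name__ == "__main__":
    uneven = [[1, 2, 3, 4, 5, 6], [7], [8], [9]]
    print(coalesce(uneven, 2))
    print(repartition(uneven, 2))
```

In this model, merging is cheap but can leave partitions uneven, whereas the round-robin redistribution balances them at the cost of moving every record.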

Q4. Elaboration of Spark optimization techniques. Types of transformations, shuffling.

Ans.

Spark optimization techniques include partitioning, caching, and using appropriate transformations.

  • Partitioning data can improve performance by reducing shuffling.

  • Caching frequently used data can reduce the need for recomputation.

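  • Narrow transformations such as map and filter run without a shuffle, while wide transformations such as reduceByKey and groupByKey require one.

  • Shuffling can be minimized by using reduceByKey instead of groupByKey, because reduceByKey combines values within each partition before the shuffle.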

  • Broadcasting small data can improve performance by reducing network traffi...
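
The reduceByKey versus groupByKey point can be made concrete with a small plain-Python model. It is not Spark code; it only counts how many records would cross the network under the assumption that a groupByKey-style shuffle sends every pair, while a reduceByKey-style shuffle first combines values per key inside each partition. Both function names are hypothetical.

```python
def shuffled_records_group_by_key(partitions):
    """groupByKey-style: every (key, value) pair is sent to the shuffle."""
    return sum(len(part) for part in partitions)


def shuffled_records_reduce_by_key(partitions, combine):
    """reduceByKey-style: combine per key within each partition, then send."""
    total = 0
    for part in partitions:
        local = {}
        for key, value in part:
            # Map-side combine: keep one running value per key.
            local[key] = combine(local[key], value) if key in local else value
        total += len(local)  # one record per distinct key per partition
    return total


if __name__ == "__main__":
    data = [
        [("a", 1), ("a", 2), ("b", 3)],
        [("a", 4), ("b", 5), ("b", 6)],
    ]
    print(shuffled_records_group_by_key(data))
    print(shuffled_records_reduce_by_key(data, lambda x, y: x + y))
```

The saving grows with the number of repeated keys per partition, which is why the combine step matters most on heavily repeated keys.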

Q5. Hive types of tables and difference between them

Ans.

Hive has two main types of tables: managed (internal) and external. Hive controls both the metadata and the data of a managed table, but only the metadata of an external table.

  • Managed tables are created with the 'CREATE TABLE' command, and their data is stored in Hive's warehouse directory.

  • External tables are created with the 'CREATE EXTERNAL TABLE' command, and their data usually lives outside Hive's warehouse directory, typically at a path given with LOCATION.

  • Dropping a managed table deletes both its metadata and its data, while dropping an external table removes only the metadata and leaves the data files in place.

  • Managed tables have full control...

Q6. Difference between RDD, Dataframe, Dataset.

Ans.

RDD, DataFrame, and Dataset are data abstractions in Apache Spark with different characteristics and functionality.

  • An RDD (Resilient Distributed Dataset) is the fundamental data structure in Spark and represents an immutable, distributed collection of objects. It provides low-level APIs for distributed data processing and fault tolerance.

  • A DataFrame is a distributed collection of data organized into named columns. It is similar to a table in a relational database and provides a hig...

Q7. Connecting Spark to Azure SQL Database.

Ans.

Spark can connect to Azure SQL Database through a JDBC driver.

  • Make the Microsoft JDBC Driver for SQL Server available on the Spark cluster.

  • Set up the connection string with the appropriate credentials.

  • Use Spark's JDBC data source to read from or write to Azure SQL Database.

  • Example: val df = spark.read.jdbc(jdbcUrl, tableName, connectionProperties)

  • Ensure that the firewall rules for the Azure SQL Database allow access from the Spark cluster.
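
A hedged PySpark-flavoured sketch of the steps above. The helper names `azure_sql_jdbc_url` and `read_table`, the server and database names, and the chosen URL options are illustrative assumptions. `read_table` expects an existing SparkSession to be passed in, so the URL helper can be tried without Spark installed.

```python
def azure_sql_jdbc_url(server, database, port=1433):
    """Build a JDBC URL for an Azure SQL Database logical server."""
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:{port};"
        f"database={database};encrypt=true;"
        "trustServerCertificate=false;loginTimeout=30;"
    )


def read_table(spark, server, database, table, user, password):
    """Read one table into a DataFrame through Spark's JDBC data source."""
    properties = {
        "user": user,
        "password": password,  # in practice, load this from a secret store
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    return spark.read.jdbc(
        url=azure_sql_jdbc_url(server, database),
        table=table,
        properties=properties,
    )


if __name__ == "__main__":
    # Hypothetical server and database names.
    print(azure_sql_jdbc_url("myserver", "sales"))
```

Keeping credentials in the properties dictionary rather than in the URL makes it easier to avoid logging them by accident.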

Q8. Discuss your project and its architecture.

Ans.

Developed a data pipeline to process and analyze customer behavior data.

  • Used Apache Kafka for real-time data streaming

  • Implemented data processing using Apache Spark

  • Stored data in Hadoop Distributed File System (HDFS)

  • Used Tableau for data visualization

Q9. How will you handle data skew in Spark?

Ans.

Data skew can be handled in Spark with techniques such as repartitioning, bucketing, and broadcasting.

  • Repartitioning the data on a well-distributed key column can spread it more evenly across the cluster.

  • Bucketing can further divide the data into a fixed number of buckets based on a hash of the bucketing column.

  • Broadcasting small tables in a join can reduce the amount of data shuffled across the network, because the large, skewed side no longer needs to be shuffled.

  • Using dynamic allocation can also help in handling data skewness by allocating more resources to tasks that are t...
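
One further technique that is commonly used for skewed keys, although the answer above does not name it, is key salting. The plain-Python sketch below is not Spark code: `partition_of` is a hypothetical, deterministic stand-in for a hash partitioner, and the key names and counts are made up for illustration.

```python
import zlib
from collections import Counter


def partition_of(key, num_partitions):
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition."""
    sizes = Counter(partition_of(key, num_partitions) for key in keys)
    return [sizes.get(p, 0) for p in range(num_partitions)]


def salt_keys(keys, num_salts):
    """Append a rotating suffix so one hot key becomes num_salts distinct keys."""
    return [f"{key}#{i % num_salts}" for i, key in enumerate(keys)]


if __name__ == "__main__":
    # 90 records share one hot key; 10 records have distinct cold keys.
    keys = ["hot"] * 90 + [f"cold{i}" for i in range(10)]
    print("unsalted:", partition_sizes(keys, 4))
    print("salted:  ", partition_sizes(salt_keys(keys, 8), 4))
```

Without salting, every record of the hot key lands in the same partition. With salting, the hot key is split into several keys that can hash to different partitions. The price is a second step: results computed per salted key have to be aggregated again on the original key.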

Q10. What Azure solutions have you worked with?

Ans.

I have worked with Azure Data Factory, Azure Databricks, and Azure SQL Database.

  • Azure Data Factory for data integration and orchestration

  • Azure Databricks for big data processing and analytics

  • Azure SQL Database for relational database management

Q11. What tech stack is used?

Ans.

The tech stack used includes Python, SQL, Apache Spark, Hadoop, AWS, and Docker.

  • Python for data processing and analysis

  • SQL for database querying

  • Apache Spark for big data processing

  • Hadoop for distributed storage and processing

  • AWS for cloud services

  • Docker for containerization

Q12. Types of variables in Scala

Ans.

Scala has two kinds of variables: mutable and immutable.

  • Mutable variables are declared with the var keyword and can be reassigned.

  • Immutable variables are declared with the val keyword and cannot be reassigned once initialized.

  • Example: var mutableVariable = 10; val immutableVariable = 20;

Interview Process at MResult Services

based on 6 interviews
1 interview round
HR Round
Made with ❤️ in India. Trademarks belong to their respective owners. All rights reserved © 2024 Info Edge (India) Ltd.
