
Virtusa Consulting Services

Rated 3.8, based on 4.4k reviews
Interview Questions and Answers

Updated 14 Dec 2024

Q1. What type of filesystem is used in your project?

Ans.

We use the Hadoop Distributed File System (HDFS) in our project; a minimal access sketch follows the points below.

  • HDFS is a distributed file system designed to run on commodity hardware.

  • It provides high-throughput access to application data and is fault-tolerant.

  • HDFS is the storage layer of Hadoop and is used by big data processing frameworks like MapReduce, Spark, and Hive.

  • It stores data in a distributed manner across multiple nodes in a cluster.

  • HDFS is optimized for large files and sequential reads and writes.
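
A minimal PySpark sketch of how an application reads from and writes to HDFS. The NameNode address (namenode:8020) and the file paths are placeholders for illustration, not values from the original answer.

```python
from pyspark.sql import SparkSession

# Build a session; the HDFS NameNode address used below is a placeholder.
spark = SparkSession.builder.appName("hdfs-demo").getOrCreate()

# Read a CSV stored on HDFS into a DataFrame.
df = spark.read.option("header", True).csv("hdfs://namenode:8020/data/input.csv")

# Write it back as Parquet; HDFS splits the output into blocks and
# replicates them across DataNodes transparently.
df.write.mode("overwrite").parquet("hdfs://namenode:8020/data/output.parquet")
```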


Q2. Command to check disk utilisation and health in Hadoop

Ans.

Use 'hdfs dfsadmin -report' together with 'hdfs diskbalancer' to check disk utilisation and health in Hadoop; a scriptable sketch follows the points below.

  • Run 'hdfs dfsadmin -report' to see configured capacity, used, and remaining space for every DataNode

  • Run 'hdfs diskbalancer -report' to find DataNodes whose disks are unevenly utilised

  • Use 'hdfs diskbalancer -plan <datanode>' to generate a plan for balancing disk usage on that node

  • Run 'hdfs fsck /' to check block health, and check the DataNode logs for any disk errors
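
A small Python sketch that shells out to the HDFS CLI to collect the reports mentioned above; it assumes the 'hdfs' binary is on PATH on a configured cluster node.

```python
import subprocess

def run(cmd):
    """Run an HDFS CLI command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Cluster-wide capacity, used, and remaining space, per DataNode.
print(run(["hdfs", "dfsadmin", "-report"]))

# DataNodes whose local disks are unevenly utilised.
print(run(["hdfs", "diskbalancer", "-report"]))

# Filesystem health: missing, corrupt, or under-replicated blocks.
print(run(["hdfs", "fsck", "/"]))
```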


Q3. Pivot table creation in SQL from a non-pivoted table

Ans.

To create a pivot table in SQL from a non-pivoted table, you can use CASE expressions with aggregate functions; a worked sketch follows the points below.

  • Use the CASE statement to categorize data into columns

  • Apply aggregate functions like SUM, COUNT, AVG, etc. to calculate values for each category

  • Group the data by the columns you want to pivot on
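
A self-contained sketch of the CASE-plus-aggregate pattern, shown here with Python's built-in sqlite3 module and a made-up sales table; table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('North', 'Q1', 100), ('North', 'Q2', 150),
  ('South', 'Q1', 80),  ('South', 'Q2', 120);
""")

# Pivot quarters into columns: one CASE expression per target column,
# aggregated with SUM and grouped by the row key (region).
rows = conn.execute("""
SELECT region,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_total,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_total
FROM sales
GROUP BY region
""").fetchall()

print(rows)  # [('North', 100, 150), ('South', 80, 120)]
```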


Q4. How to create triggers

Ans.

Creating triggers in a database involves defining the trigger, specifying the event that activates it, and writing the code to be executed; a runnable example follows the points below.

  • Define the trigger using the CREATE TRIGGER statement

  • Specify the event that will activate the trigger (e.g. INSERT, UPDATE, DELETE)

  • Write the code or actions to be executed when the trigger is activated

  • Test the trigger to ensure it functions as intended
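
A runnable illustration of the four steps above using Python's sqlite3; the table and trigger names are invented for the example, and the exact trigger syntax (e.g. FOR EACH ROW, OLD/NEW references) varies between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER);
CREATE TABLE salary_audit (employee_id INTEGER, old_salary INTEGER, new_salary INTEGER);

-- Define the trigger (CREATE TRIGGER), specify the activating event
-- (AFTER UPDATE of salary), and write the action to execute.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON employees
FOR EACH ROW
BEGIN
    INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
END;
""")

# Test the trigger: one UPDATE should produce one audit row.
conn.execute("INSERT INTO employees VALUES (1, 50000)")
conn.execute("UPDATE employees SET salary = 55000 WHERE id = 1")
print(conn.execute("SELECT * FROM salary_audit").fetchall())  # [(1, 50000, 55000)]
```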


Q5. Spark optimization techniques

Ans.

Optimization techniques in Spark improve the performance and efficiency of data processing; a short sketch of several of them follows the points below.

  • Partitioning data to distribute workload evenly

  • Caching frequently accessed data in memory

  • Using broadcast variables for small lookup tables

  • Avoiding shuffling operations whenever possible

  • Tuning memory settings and garbage collection parameters
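
A hedged PySpark sketch of the first three techniques (partitioning, caching, broadcast joins); the DataFrame names, paths, and partition count are illustrative assumptions, not values from the original answer.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("spark-opt").getOrCreate()

events = spark.read.parquet("/data/events")        # large fact table (placeholder path)
countries = spark.read.parquet("/data/countries")  # small lookup table (placeholder path)

# Partitioning: spread the workload evenly across the cluster.
events = events.repartition(200, "country_code")

# Caching: keep a frequently reused DataFrame in memory.
events.cache()

# Broadcast join: ship the small lookup table to every executor,
# avoiding a shuffle of the large side.
joined = events.join(broadcast(countries), "country_code")
joined.count()
```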


Q6. Spark optimization techniques

Ans.

Spark optimization techniques improve the performance and efficiency of Spark applications; a configuration-focused sketch follows the points below.

  • Partitioning data appropriately to avoid unnecessary shuffling

  • Caching frequently used data

  • Using broadcast variables and broadcast joins for small data

  • Using efficient data formats like Parquet

  • Using appropriate serialization formats (e.g. Kryo)

  • Tuning memory, CPU usage, and cluster size
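
A complementary PySpark sketch showing the configuration-side techniques (serialization, memory, shuffle partitions, broadcast join threshold) plus Parquet output; every value and path below is a placeholder assumption, not a tuned recommendation.

```python
from pyspark.sql import SparkSession

# Tune memory, serialization, and shuffle behaviour through the session
# config; the numbers here are illustrative placeholders.
spark = (
    SparkSession.builder
    .appName("spark-opt-config")
    .config("spark.executor.memory", "4g")
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .config("spark.sql.shuffle.partitions", "200")
    .config("spark.sql.autoBroadcastJoinThreshold", str(10 * 1024 * 1024))
    .getOrCreate()
)

# Efficient columnar format: Parquet lets queries read only the columns they need.
df = spark.read.json("/data/raw")  # placeholder input path
df.write.mode("overwrite").parquet("/data/clean")
```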
