Klenzaids Interview Questions and Answers

Updated 10 Oct 2024
Q1. inferSchema in PySpark when reading a file

Ans.

inferSchema is a read option in PySpark that tells Spark to infer column types from the data instead of treating every column as a string.

  • Set inferSchema=True when reading a file such as a CSV; without it, every CSV column is read as a string

  • It is useful when the schema of the file is not known beforehand, but it costs an extra pass over the data, so an explicit schema is usually preferred for large files

  • Example: df = spark.read.csv('file.csv', header=True, inferSchema=True)
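To make the idea of inference concrete without needing a Spark cluster, here is a toy sketch in plain Python. It is not Spark's implementation: the helper names `infer_type` and `infer_schema` and the sample data are hypothetical, and only int, float, and str are considered. It does show why inference needs to look at the data before the real read can assign types.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest of int, float, str that fits every value."""
    for caster in (int, float):
        try:
            for v in values:
                caster(v)
            return caster
        except ValueError:
            continue  # this type does not fit, try the next wider one
    return str


def infer_schema(text):
    """Scan all rows once and map each header name to an inferred type name."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))  # transpose rows into columns
    return {name: infer_type(col).__name__ for name, col in zip(header, columns)}


# Hypothetical sample file content
SAMPLE = "id,price,city\n1,9.5,Pune\n2,12,Delhi\n"
schema = infer_schema(SAMPLE)
print(schema)
```

The `price` column comes out as float because one value (9.5) does not parse as an int, which mirrors how a single wider value forces a wider type for the whole column.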


Q2. What is SCD in a data warehouse?

Ans.

SCD stands for Slowly Changing Dimension in Data Warehousing.

  • SCD is a technique used in data warehousing to track changes to dimension data over time.

  • The most commonly used types are Type 1, Type 2, and Type 3.

  • Type 1 SCD overwrites old data with new data, Type 2 creates new records for changes, and Type 3 maintains both old and new values in separate columns.

  • Example: In a customer dimension table, if a customer changes their address, a Type 2 SCD would create a new record ...
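A Type 2 change can be sketched in plain Python as follows. This is only an illustration under assumptions: the column names (`valid_from`, `valid_to`, `is_current`) and the function `apply_scd2` are hypothetical, and a real warehouse would do this with a MERGE statement or an ETL tool rather than a list of dicts.

```python
from datetime import date


def apply_scd2(dimension, customer_id, new_address, change_date):
    """Close the customer's current row and append a new current row."""
    for row in dimension:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["address"] == new_address:
                return dimension  # nothing changed, keep history as-is
            row["valid_to"] = change_date  # expire the old version
            row["is_current"] = False
    dimension.append({
        "customer_id": customer_id,
        "address": new_address,
        "valid_from": change_date,
        "valid_to": None,  # open-ended: this is the current version
        "is_current": True,
    })
    return dimension


# Hypothetical starting state of the customer dimension
dim = [{
    "customer_id": 1,
    "address": "Pune",
    "valid_from": date(2020, 1, 1),
    "valid_to": None,
    "is_current": True,
}]
apply_scd2(dim, 1, "Delhi", date(2024, 10, 1))
for row in dim:
    print(row)
```

A Type 1 change would instead overwrite `address` on the existing row, and a Type 3 change would keep a single row with an extra column such as a previous-address field.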


Q3. Optimization techniques in Spark

Ans.

Optimization techniques in Spark include partitioning, caching, and resource tuning for efficient data processing.

  • Use partitioning to distribute data evenly across nodes for parallel processing

  • Cache frequently accessed data in memory to avoid recomputation

  • Tune resources such as memory allocation and parallelism settings for optimal performance


Q4. repartition vs coalesce

Ans.

repartition can increase or decrease the number of partitions in a DataFrame and always performs a full shuffle, while coalesce only decreases the number of partitions and avoids a full shuffle.

  • Repartition involves shuffling data across the network, which can be expensive in terms of performance and resources.

  • Coalesce is a more efficient operation as it minimizes data movement by only merging existing partitions.

  • Repartition is typically used when there is a need for more parallelism or to evenly distribute data for better performance.

  • C...
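The difference in data movement can be modelled in plain Python, treating each partition as a list. This is a toy model, not Spark's code: the functions `coalesce` and `repartition` below are hypothetical stand-ins, and Spark groups partitions by locality rather than by index as done here.

```python
def coalesce(partitions, n):
    """Merge whole partitions into n groups; rows are never split up."""
    if n >= len(partitions):
        return [list(p) for p in partitions]  # coalesce cannot increase the count
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i % n].extend(part)  # move an entire partition at once
    return merged


def repartition(partitions, n):
    """Redistribute every row round-robin, modelling a full shuffle."""
    rows = [r for part in partitions for r in part]
    out = [[] for _ in range(n)]
    for i, r in enumerate(rows):
        out[i % n].append(r)  # each row may land in a different partition
    return out


# Hypothetical skewed input: one large partition and three tiny ones
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9]]
coalesced_sizes = [len(p) for p in coalesce(skewed, 2)]
repartitioned_sizes = [len(p) for p in repartition(skewed, 2)]
print(coalesced_sizes)      # sizes stay uneven because partitions are merged whole
print(repartitioned_sizes)  # sizes are balanced because rows are redistributed
```

In this model, asking `coalesce` for more partitions than exist leaves the count unchanged, while `repartition` produces the requested count, which matches the behavior described above.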


Q5. Normalization in databases and its types

Ans.

Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.

  • It involves breaking down a table into smaller tables and defining relationships between them.

  • The common normal forms are 1NF, 2NF, 3NF, and BCNF.

  • Normalization reduces redundancy and prevents insert, update, and delete anomalies; reads may need more joins, so it does not by itself improve query performance.

  • Example: In a database, instead of storing...
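As a small illustration of splitting one table into related tables, the sketch below uses Python's built-in sqlite3 module. The `customers` and `orders` tables and their columns are hypothetical and chosen only for this example; they are not taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Asha', 'Pune');
INSERT INTO orders VALUES (101, 1, 250.0), (102, 1, 90.0);
""")

# The city is stored once, so a change is a single-row update
# instead of an update to every order row.
conn.execute("UPDATE customers SET city = 'Delhi' WHERE customer_id = 1")

# Reading the combined view now requires a join.
rows = conn.execute("""
SELECT o.order_id, c.name, c.city, o.amount
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
ORDER BY o.order_id
""").fetchall()
print(rows)
```

The trade-off is visible in the last query: integrity improves because the city lives in one place, but reads that need both customer and order data have to join the two tables.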


Q6. Transformation vs action

Ans.

A transformation lazily defines a new RDD or DataFrame from an existing one, while an action triggers execution and returns a result to the driver or writes it to storage.

  • Transformations are lazy: they only extend the execution plan and do not run any computation

  • Actions trigger execution of the plan built by the preceding transformations

  • Examples of transformations include map, filter, and reduceByKey in Spark

  • Examples of actions include count, collect, reduce, and saveAsTextFile in Spark
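Lazy evaluation can be demonstrated with Python's built-in lazy iterators, which serve here only as an analogy for Spark: `map` and `filter` play the role of transformations and `list` plays the role of an action such as collect. The `double` function and the `calls` log are hypothetical helpers used to observe when work actually happens.

```python
calls = []


def double(x):
    calls.append(x)  # record each time the function really runs
    return x * 2


data = [1, 2, 3, 4]

# "Transformations": map and filter only build a lazy pipeline.
pipeline = filter(lambda v: v > 4, map(double, data))
ran_before_action = len(calls)  # nothing has executed yet

# "Action": materializing the result triggers the whole pipeline.
result = list(pipeline)

print(ran_before_action)
print(result)
print(calls)
```

No call to `double` is recorded until `list` consumes the pipeline, which is the same reason a chain of Spark transformations does no work until an action is invoked.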


Q7. RANK vs DENSE_RANK

Ans.

RANK gives tied rows the same rank and leaves gaps after them, while DENSE_RANK gives tied rows the same rank without gaps.

  • Both functions assign the same rank to rows with equal values in the ordering column.

  • RANK skips the following ranks after a tie: with two rows tied at rank 1, the next row gets rank 3.

  • DENSE_RANK never skips ranks: in the same case, the next row gets rank 2.
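The two functions can be compared side by side with Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The `scores` table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (name TEXT, score INTEGER);
INSERT INTO scores VALUES ('A', 90), ('B', 90), ('C', 80), ('D', 70);
""")

rows = conn.execute("""
SELECT name,
       score,
       RANK()       OVER (ORDER BY score DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
FROM scores
ORDER BY score DESC, name
""").fetchall()

for row in rows:
    print(row)
```

A and B tie on score, so both get rank 1 from either function; C then gets 3 from RANK (rank 2 is skipped) and 2 from DENSE_RANK.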
