Tiger Analytics
SI-UK Interview Questions and Answers
Q1. Explain Databricks DLT. When would you use batch vs. streaming?
Databricks DLT (Delta Live Tables) is a declarative framework for building batch and streaming data pipelines on Databricks.
DLT pipelines write to Delta Lake tables; Delta Lake is the storage layer that brings ACID transactions to Apache Spark and big data workloads.
Batch processing is used when data is collected over a period of time and processed in large chunks, while stream processing handles data continuously as it arrives, with low latency.
Use batch processing for historical data analysis, ETL jobs, and periodic reporting. Use streaming proce...
Q2. What are the different license types in Power BI? Data modelling.
Power BI offers several license types, including Power BI Pro and Power BI Premium.
A Power BI Pro license allows users to create reports and dashboards and share them with others.
A Power BI Premium license offers additional features such as larger data capacity and advanced AI capabilities.
Power BI Embedded is an Azure service designed for embedding reports and dashboards into custom applications.
Power BI Report Server allows for on-premises report publishing an...
Q3. What is the difference between Delta Lake and Delta Warehouse?
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads.
"Delta Warehouse" is not a standard product name; the answer here treats it as a cloud-based data warehouse service that provides scalable storage and analytics capabilities.
Delta Lake is more focused on data lake operations and ensuring data reliabilit...
Q4. Find the most frequent word in a sentence.
The most frequent word in a sentence can be found by counting the occurrences of each word and selecting the one with the highest count.
Split the sentence into words using whitespace as the delimiter
Create a dictionary to store the count of each word
Iterate through the words and update the count in the dictionary
Find the word with the highest count in the dictionary
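The steps above can be sketched in Python as follows. This is a minimal illustration, not the interviewer's expected answer: the function name `most_frequent_word` is hypothetical, and lowercasing the input so that "The" and "the" count as the same word is an assumption the original answer does not state.

```python
from collections import Counter
from typing import Optional


def most_frequent_word(sentence: str) -> Optional[str]:
    """Return the most frequent whitespace-delimited word, or None if there are no words."""
    # Assumption: counting is case-insensitive, so the input is lowercased first.
    words = sentence.lower().split()  # split() with no argument splits on any whitespace
    if not words:
        return None
    counts = Counter(words)  # dictionary-like mapping of word -> occurrence count
    word, _count = counts.most_common(1)[0]  # entry with the highest count
    return word
```

Punctuation is not stripped here, so "cat" and "cat," would be counted separately; a fuller answer would normalise that before counting.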
Q5. Expected CTC and current CTC negotiations
The interviewer discusses your current and expected CTC (cost to company) as part of salary negotiation.
Be honest about your current salary and provide a realistic expectation for your desired salary.
Highlight your skills and experience that justify your desired salary.
Be open to negotiation and willing to discuss other benefits besides salary.
Research industry standards and salary ranges for similar positions to support your negotiation.
Focus on the value you can bring to the company rather than just the monetary ...
Q6. What is indexing in SQL?
Indexing in SQL is a technique to improve the performance of queries by creating a data structure that allows for faster retrieval of data.
Indexes are created on columns in a database table to speed up the retrieval of data.
They work much like the index in a book, allowing the database to quickly find the rows that match a certain value without scanning the whole table.
Indexes can be created on a single column or on multiple columns.
Example: CREATE INDEX index_name ON table_name(column_name);
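To see the effect of an index, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `employees` table, its columns, the index name `idx_employees_department`, and the helper `query_plan` are all hypothetical, chosen only for illustration; other database engines expose query plans through different commands.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail text.
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 10}") for i in range(1000)],
)

lookup = "SELECT name FROM employees WHERE department = ?"

# Without an index, SQLite has to scan every row of the table.
plan_before = query_plan(conn, lookup, ("dept3",))

# Same statement shape as the example above, applied to the department column.
conn.execute("CREATE INDEX idx_employees_department ON employees(department)")

# With the index in place, the plan should reference the index instead.
plan_after = query_plan(conn, lookup, ("dept3",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should describe a table scan and the second a search using the new index.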
Q7. Design round for an ADF pipeline
Designing an ADF (Azure Data Factory) pipeline for data processing involves the following steps:
Identify data sources and destinations
Define data transformations and processing steps
Consider scheduling and monitoring requirements
Use ADF activities such as Copy Data, Data Flow, and Databricks Notebook
Implement error handling and logging mechanisms
Q8. Explain Spark architecture
Apache Spark is a distributed computing framework whose architecture consists of a driver program, a cluster manager, and worker nodes.
The driver program builds the execution plan and schedules tasks
The cluster manager (for example, Spark standalone, YARN, or Kubernetes) allocates resources to the application
Worker nodes run executors, which execute the tasks and keep data in memory or on disk
Fault tolerance comes from resilient distributed datasets (RDDs), which can recompute lost partitions from their lineage