I applied via Campus Placement
One good coding question and 33 MCQs.
Create a database to store information about colleges, students, and professors.
Create tables for colleges, students, and professors
Include columns for relevant information such as name, ID, courses, etc.
Establish relationships between the tables using foreign keys
Use SQL queries to insert, update, and retrieve data
Consider normalization to avoid data redundancy
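A minimal sketch of such a schema, using Python's built-in sqlite3 so it runs anywhere; the table and column names are illustrative, not from the interview. Each entity gets its own table (normalization) and foreign keys link students and professors to their college.

import sqlite3

conn = sqlite3.connect(":memory:")          # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")    # SQLite enforces FKs only when this is on
conn.executescript("""
CREATE TABLE colleges (
    college_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL
);
CREATE TABLE professors (
    professor_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    college_id   INTEGER NOT NULL REFERENCES colleges(college_id)
);
CREATE TABLE students (
    student_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    course       TEXT,
    college_id   INTEGER NOT NULL REFERENCES colleges(college_id)
);
""")
conn.execute("INSERT INTO colleges VALUES (1, 'Example College')")
conn.execute("INSERT INTO students VALUES (1, 'Asha', 'CSE', 1)")
print(conn.execute(
    "SELECT s.name, c.name FROM students s JOIN colleges c ON s.college_id = c.college_id"
).fetchall())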
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Databricks is a unified analytics platform that provides a collaborative environment for data scientists, engineers, and analysts.
Databricks simplifies the process of building data pipelines and training machine learning models.
It allows for easy integration with various data sources and tools, such as Apache Spark and Delta Lake.
Databricks provides a scalable and secure platform for processing big data and running machine learning workloads.
Optimizing code involves identifying bottlenecks, improving algorithms, using efficient data structures, and minimizing resource usage.
Identify and eliminate bottlenecks in the code by profiling and analyzing performance.
Improve algorithms by using more efficient techniques and data structures.
Use appropriate data structures like hash maps, sets, and arrays to optimize memory usage and access times.
Minimize resource usage (memory, CPU, I/O) by avoiding unnecessary work and releasing resources promptly.
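As a small, self-contained illustration of the data-structure point: if profiling shows repeated membership checks are the bottleneck, swapping a list for a set turns each lookup from a linear scan into an average O(1) hash lookup. The sizes below are arbitrary.

import time

ids_list = list(range(100_000))
ids_set = set(ids_list)
lookups = range(99_500, 100_500)            # 1,000 membership checks, half of them misses

start = time.perf_counter()
hits_list = sum(1 for x in lookups if x in ids_list)   # linear scan per check
t_list = time.perf_counter() - start

start = time.perf_counter()
hits_set = sum(1 for x in lookups if x in ids_set)     # hash lookup per check
t_set = time.perf_counter() - start

print(hits_list == hits_set, f"list: {t_list:.3f}s, set: {t_set:.5f}s")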
SQL window function is used to perform calculations across a set of table rows related to the current row.
Window functions operate on a set of rows related to the current row
They can be used to calculate running totals, moving averages, rank, etc.
Examples include ROW_NUMBER(), RANK(), SUM() OVER(), etc.
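A short PySpark sketch of the same idea, with made-up data and column names: row_number() ranks each customer's orders by date and sum(...).over(...) builds a running total per customer.

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("window-demo").getOrCreate()
orders = spark.createDataFrame(
    [("A", "2024-01-01", 100), ("A", "2024-01-02", 50), ("B", "2024-01-01", 80)],
    ["customer", "order_date", "amount"],
)

w = Window.partitionBy("customer").orderBy("order_date")   # the "window" of related rows
orders.withColumn("row_num", F.row_number().over(w)) \
      .withColumn("running_total", F.sum("amount").over(w)) \
      .show()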
Half-hour round on Spark, Python, and Azure Databricks.
I applied via Naukri.com and was interviewed before Dec 2023. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in Sep 2023. There was 1 interview round.
I have used activities such as Copy Data, Execute Pipeline, Lookup, and Data Flow in Data Factory.
Copy Data activity is used to copy data from a source to a destination.
Execute Pipeline activity is used to trigger another pipeline within the same or different Data Factory.
Lookup activity is used to retrieve data from a specified dataset or table.
Data Flow activity is used for data transformation and processing.
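As a rough, abridged illustration of how two of those activities look inside a pipeline's JSON definition (activity and dataset names are hypothetical, and source/sink details are omitted), a Lookup feeding a Copy Data activity through a dependency:

pipeline_fragment = {
    "activities": [
        {   # Lookup: fetch a config row or small dataset that drives the copy
            "name": "LookupConfig",
            "type": "Lookup",
            "typeProperties": {
                "dataset": {"referenceName": "ConfigTable", "type": "DatasetReference"}
            },
        },
        {   # Copy Data: runs only after the lookup succeeds
            "name": "CopySalesData",
            "type": "Copy",
            "dependsOn": [{"activity": "LookupConfig", "dependencyConditions": ["Succeeded"]}],
            "inputs": [{"referenceName": "SourceDataset", "type": "DatasetReference"}],
            "outputs": [{"referenceName": "SinkDataset", "type": "DatasetReference"}],
        },
    ]
}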
To execute a second notebook from the first notebook, you can use the %run magic command in Jupyter Notebook.
Use the %run magic command followed by the path to the second notebook in the first notebook.
Ensure that the second notebook is in the same directory or provide the full path to the notebook.
Make sure to save any changes in the second notebook before executing it from the first notebook.
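For reference, in Databricks (the stack this interview focused on) the same idea looks like the cells below; the notebook path is a placeholder. %run inlines the other notebook's definitions into the current session, while dbutils.notebook.run starts it as a separate child run and returns its exit value.

%run ./helpers/setup_notebook
# ^ must sit alone in its own cell; variables and functions defined there become available here

# In another cell: run the notebook as an isolated child run with a 60-second timeout and parameters
result = dbutils.notebook.run("./helpers/setup_notebook", 60, {"env": "dev"})
print(result)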
Data lake storage is optimized for big data analytics and can store structured, semi-structured, and unstructured data. Blob storage is for unstructured data only.
Data lake storage is designed for big data analytics and can handle structured, semi-structured, and unstructured data
Blob storage is optimized for storing unstructured data like images, videos, documents, etc.
Data lake storage allows for complex queries and large-scale analytics, while blob storage is aimed at simple object storage and retrieval.
I applied via Naukri.com and was interviewed in Sep 2023. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in Sep 2024. There was 1 interview round.
To create a pipeline in ADF, you can use the Azure Data Factory UI or a code-based approach.
Use Azure Data Factory UI to visually create and manage pipelines
Use code-based approach with JSON to define pipelines and activities
Add activities such as data movement, data transformation, and data processing to the pipeline
Set up triggers and schedules for the pipeline to run automatically
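A hedged sketch of the code-based route using the azure-mgmt-datafactory Python SDK, following the pattern in Microsoft's Python quickstart; the subscription, resource group, factory, and the two datasets are assumed to already exist, and model class names can vary slightly by SDK version.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineResource, CopyActivity, DatasetReference, BlobSource, BlobSink,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

copy_step = CopyActivity(
    name="CopyBlobToBlob",
    inputs=[DatasetReference(reference_name="InputDataset", type="DatasetReference")],
    outputs=[DatasetReference(reference_name="OutputDataset", type="DatasetReference")],
    source=BlobSource(),
    sink=BlobSink(),
)
pipeline = PipelineResource(activities=[copy_step])
adf.pipelines.create_or_update("my-rg", "my-factory", "CopyPipeline", pipeline)

A schedule or tumbling-window trigger would then be attached to the published pipeline so it runs automatically.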
Activities in pipelines include data extraction, transformation, loading, and monitoring.
Data extraction: Retrieving data from various sources such as databases, APIs, and files.
Data transformation: Cleaning, filtering, and structuring data for analysis.
Data loading: Loading processed data into a data warehouse or database.
Monitoring: Tracking the performance and health of the pipeline to ensure data quality and reliability.
getmetadata is used to retrieve metadata information about a dataset or data source.
getmetadata can provide information about the structure, format, and properties of the data.
It can be used to understand the data schema, column names, data types, and any constraints or relationships.
This information is helpful for data engineers to properly process, transform, and analyze the data.
For example, getmetadata can be used to list the files in a folder and check when they were last modified before processing them.
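An abridged, illustrative shape of the activity definition (the dataset name is hypothetical): fieldList controls which properties come back, and downstream activities read the result through an expression.

get_metadata_activity = {
    "name": "GetFolderMetadata",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {"referenceName": "LandingFolder", "type": "DatasetReference"},
        "fieldList": ["childItems", "lastModified"],   # properties to retrieve for the folder
    },
}
# A later activity can reference the output, e.g.
# @activity('GetFolderMetadata').output.childItems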
Triggers in databases are special stored procedures that are automatically executed when certain events occur.
Types of triggers include: DML triggers (for INSERT, UPDATE, DELETE operations), DDL triggers (for CREATE, ALTER, DROP operations), and logon triggers.
Triggers can be classified as row-level triggers (executed once for each row affected by the triggering event) or statement-level triggers (executed once for each triggering statement).
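A runnable row-level DML trigger, sketched with Python's built-in sqlite3 (SQLite only supports row-level DML triggers, so DDL and logon triggers are not shown); the tables are made up for the example.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, salary REAL);
CREATE TABLE salary_audit (employee_id INTEGER, old_salary REAL, new_salary REAL);

-- AFTER UPDATE trigger: fires once for each updated row
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON employees
FOR EACH ROW
BEGIN
    INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
END;
""")
conn.execute("INSERT INTO employees VALUES (1, 50000)")
conn.execute("UPDATE employees SET salary = 55000 WHERE id = 1")
print(conn.execute("SELECT * FROM salary_audit").fetchall())   # [(1, 50000.0, 55000.0)]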
Normal cluster is used for interactive workloads while job cluster is used for batch processing in Databricks.
Normal cluster is used for ad-hoc queries and exploratory data analysis.
Job cluster is used for running scheduled jobs and batch processing tasks.
Normal cluster is terminated after a period of inactivity, while job cluster is terminated after the job completes.
Normal cluster is more cost-effective for short-lived interactive work.
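In Databricks Jobs API terms the difference shows up in the cluster spec; below is a hedged, abridged example (all field values are placeholders) where one job creates a fresh job cluster per run via new_cluster and another attaches to a running all-purpose cluster via existing_cluster_id.

job_with_job_cluster = {
    "name": "nightly-etl",
    "tasks": [{
        "task_key": "transform",
        "notebook_task": {"notebook_path": "/Jobs/transform"},
        "new_cluster": {                          # created for this run, terminated when it finishes
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 2,
        },
    }],
}

job_on_all_purpose_cluster = {
    "name": "adhoc-analysis",
    "tasks": [{
        "task_key": "explore",
        "notebook_task": {"notebook_path": "/Jobs/explore"},
        "existing_cluster_id": "0123-456789-abcde123",   # reuses a running interactive cluster
    }],
}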
Slowly changing dimensions refer to data warehouse dimensions that change slowly over time.
SCDs are used to track historical changes in data over time.
There are three types of SCDs - Type 1, Type 2, and Type 3.
Type 1 SCDs overwrite old data with new data, Type 2 creates new records for changes, and Type 3 maintains both old and new data in separate columns.
Example: A customer's address changing would be a Type 2 SCD.
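A hedged sketch of the Type 2 case using Delta Lake SQL from PySpark; the table and column names are invented, dim_customer is assumed to be a Delta table and staged_customers a staging table or view. The current row of a changed customer is closed out first, then a new current row is inserted.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # assumes a Delta-enabled (e.g. Databricks) session

# Step 1: close out the current version of any customer whose address changed
spark.sql("""
MERGE INTO dim_customer AS t
USING staged_customers AS s
ON t.customer_id = s.customer_id AND t.is_current = true
WHEN MATCHED AND t.address <> s.address THEN
  UPDATE SET is_current = false, end_date = current_date()
""")

# Step 2: insert a new current row for changed or brand-new customers
spark.sql("""
INSERT INTO dim_customer
SELECT s.customer_id, s.address, current_date() AS start_date, NULL AS end_date, true AS is_current
FROM staged_customers s
LEFT JOIN dim_customer t
  ON t.customer_id = s.customer_id AND t.is_current = true
WHERE t.customer_id IS NULL OR t.address <> s.address
""")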
Use Python's 'with' statement to ensure proper resource management and exception handling.
Use 'with' statement to automatically close files after use
Helps in managing resources like database connections
Ensures proper cleanup even in case of exceptions
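A minimal example of all three points: the file handle and the lock are both released automatically, even if the body raises.

from threading import Lock

lock = Lock()

with open("orders.csv", "w") as f, lock:    # file closed and lock released on exit, no matter what
    f.write("id,amount\n1,100\n")
    # an exception raised here would still trigger f.close() and lock.release()

with open("orders.csv") as f:
    print(f.read())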
List is mutable, tuple is immutable in Python.
List can be modified after creation, tuple cannot be modified.
List uses square brackets [], tuple uses parentheses ().
Lists are used for collections of items that may need to be changed, tuples are used for fixed collections of items.
Example: list_example = [1, 2, 3], tuple_example = (4, 5, 6)
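Extending that example: appending to the list works, while assigning into the tuple raises a TypeError.

list_example = [1, 2, 3]
tuple_example = (4, 5, 6)

list_example.append(4)            # fine: lists are mutable
print(list_example)               # [1, 2, 3, 4]

try:
    tuple_example[0] = 99         # tuples are immutable
except TypeError as err:
    print("cannot modify a tuple:", err)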
Datalake 1 and Datalake 2 are both storage systems for big data, but they may differ in terms of architecture, scalability, and use cases.
Datalake 1 may use a Hadoop-based architecture while Datalake 2 may use a cloud-based architecture like AWS S3 or Azure Data Lake Storage.
Datalake 1 may be more suitable for on-premise data storage and processing, while Datalake 2 may offer better scalability and flexibility for cloud-based workloads.
To read a file in Databricks, you can use the Databricks File System (DBFS) or Spark APIs.
Use dbutils.fs.ls('dbfs:/path/to/file') to list files in DBFS
Use spark.read.format('csv').load('dbfs:/path/to/file') to read a CSV file
Use spark.read.format('parquet').load('dbfs:/path/to/file') to read a Parquet file
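Putting those lines together as one runnable cell, assuming a Databricks notebook where spark and dbutils are already provided; the mount path and file names are placeholders.

path = "dbfs:/mnt/raw/sales/"                     # hypothetical mount point

display(dbutils.fs.ls(path))                      # list the files under the folder

csv_df = (spark.read.format("csv")
          .option("header", "true")               # first line holds column names
          .option("inferSchema", "true")
          .load(path + "sales_2024.csv"))

parquet_df = spark.read.format("parquet").load(path + "sales_2024.parquet")

csv_df.printSchema()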
Star schema is denormalized with one central fact table surrounded by dimension tables, while snowflake schema is normalized with multiple related dimension tables.
Star schema is easier to understand and query due to denormalization.
Snowflake schema saves storage space by normalizing data.
Star schema is better for data warehousing and OLAP applications.
Snowflake schema suits dimensions with complex hierarchies and relationships, at the cost of more joins per query.
repartition increases partitions while coalesce decreases partitions in Spark
repartition shuffles data and can be used for increasing partitions for parallelism
coalesce reduces partitions without shuffling data, useful for reducing overhead
repartition is more expensive than coalesce as it involves data movement
example: df.repartition(10) vs df.coalesce(5)
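A quick way to see the difference with throwaway data: getNumPartitions() confirms the partition counts, and only repartition triggers a full shuffle.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000)                       # single "id" column

print(df.rdd.getNumPartitions())                  # whatever the default parallelism gives

wide = df.repartition(10)                         # full shuffle; can also repartition by a column
print(wide.rdd.getNumPartitions())                # 10

narrow = wide.coalesce(5)                         # merges existing partitions, no full shuffle
print(narrow.rdd.getNumPartitions())              # 5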
Parquet file format is a columnar storage format used for efficient data storage and processing.
Parquet files store data in a columnar format, which allows for efficient querying and processing of specific columns without reading the entire file.
It supports complex nested data structures like arrays and maps.
Parquet files are highly compressed, reducing storage space and improving query performance.
It is commonly used with big data engines such as Spark and Hive.
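A small sketch of why the columnar layout matters, written to a placeholder local path: after the write, a query that selects only one column reads just that column's data from the Parquet files.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
orders = spark.createDataFrame(
    [(1, "laptop", 75000.0), (2, "phone", 30000.0)],
    ["order_id", "product", "price"],
)

orders.write.mode("overwrite").parquet("/tmp/orders_parquet")   # compressed columnar files

# Column pruning: only the 'price' column needs to be read back
spark.read.parquet("/tmp/orders_parquet").select("price").show()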
I applied via campus placement at VNR Vignan Jyothi Institute of Engineering & Technology, Ranga Reddy and was interviewed in Jun 2024. There were 3 interview rounds.
Coding and aptitude. The aptitude section was really simple; you could rule out the options in the MCQ test. The coding round had two SQL and two Python questions, one simple and one hard per topic.
They selected 75 percent of the candidates, ruling out only those who didn't speak at all or spoke very little.
Tredence salaries by designation:
Associate Manager | 356 salaries | ₹12.5 L/yr - ₹36.5 L/yr
Consultant | 340 salaries | ₹6.5 L/yr - ₹20 L/yr
Senior Business Analyst | 267 salaries | ₹6.5 L/yr - ₹17 L/yr
Data Engineer | 205 salaries | ₹6 L/yr - ₹22 L/yr
Business Analyst | 173 salaries | ₹6 L/yr - ₹12 L/yr
Companies similar to Tredence:
Fractal Analytics
Mu Sigma
LatentView Analytics
AbsolutData