Deloitte
10+ AppZoro Technologies Interview Questions and Answers
Q1. Which modules have you used in Python?
I have used modules like pandas, numpy, matplotlib, and sklearn in Python for data manipulation, analysis, visualization, and machine learning tasks.
pandas - for data manipulation and analysis
numpy - for numerical computing and array operations
matplotlib - for data visualization
sklearn - for machine learning tasks
Q2. What is materialized view in bigquery?
Materialized view in BigQuery is a precomputed result set stored as a table for faster query performance.
Materialized views store the results of a query and can be used to speed up query performance by avoiding the need to recompute the same result multiple times.
BigQuery refreshes them automatically in the background to reflect changes in the underlying data.
Materialized views are particularly useful for complex queries that involve aggregations or joins.
Example: CREATE MATERIALIZED VIEW my_materiali...
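The precompute-and-reuse idea can be sketched without a BigQuery account. The snippet below is only an analogy using Python's built-in sqlite3 module, which has no materialized views: a summary table (hypothetical name sales_by_region_mv) is built from a hypothetical sales table and then read cheaply, and refresh_materialized() stands in for the refresh that BigQuery performs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 5), ("west", 7)],
)

def refresh_materialized():
    """Recompute the stored aggregate (stand-in for a materialized view refresh)."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region_mv")
    conn.execute(
        "CREATE TABLE sales_by_region_mv AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )

def read_materialized():
    """Read the precomputed result without re-aggregating the base table."""
    rows = conn.execute(
        "SELECT region, total FROM sales_by_region_mv ORDER BY region"
    )
    return dict(rows.fetchall())

refresh_materialized()
first = read_materialized()     # {'east': 15, 'west': 7}

conn.execute("INSERT INTO sales VALUES ('west', 3)")
stale = read_materialized()     # unchanged until the next refresh
refresh_materialized()
fresh = read_materialized()     # {'east': 15, 'west': 10}
```

In this toy the stored result goes stale between refreshes, which is why a refresh step is part of any materialized-view design.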
Q3. What are generators and decorators?
Generators and decorators are features in Python. Generators are functions that can pause and resume execution, while decorators are functions that modify other functions.
Generators are functions that use the yield keyword to return values one at a time, allowing for efficient memory usage.
Decorators are functions that take another function as input and return a new function with added functionality.
Generators can be used to iterate over large datasets without loading everyth...
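A minimal sketch of both features together, using only the standard library. The names count_calls, read_in_chunks, and total are made up for illustration: the generator yields one chunk at a time, and the decorator wraps a function to count its calls.

```python
import functools

def count_calls(func):
    """Decorator: return a wrapper that counts how often func is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

def read_in_chunks(items, size):
    """Generator: yield lists of at most `size` items, one chunk at a time."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk          # final, possibly shorter, chunk

@count_calls
def total(chunk):
    return sum(chunk)

chunk_totals = [total(c) for c in read_in_chunks(range(10), 4)]
# chunk_totals == [6, 22, 17] and total.calls == 3
```

Only one chunk is held in memory at a time, and functools.wraps keeps the decorated function's original name and docstring.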
Q4. What is partitioning and clustering?
Partitioning is dividing data into smaller parts based on a key, while clustering is storing data together based on similar values.
Partitioning is used to improve query performance by reducing the amount of data that needs to be scanned.
Clustering is used to physically store related data together on disk to improve query performance.
In BigQuery, partitioning can be done based on a date column, while clustering can be done based on one or more columns to group related data tog...
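The two ideas can be illustrated with a toy in plain Python. This is not how BigQuery is implemented, and the rows and column meanings are invented for the example: each date gets its own bucket (partitioning), and rows inside a bucket are kept sorted by user (clustering) so that matching rows sit next to each other.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

rows = [
    ("2024-01-01", "alice", 10),
    ("2024-01-01", "bob", 20),
    ("2024-01-02", "alice", 30),
    ("2024-01-02", "carol", 40),
    ("2024-01-02", "alice", 50),
]

# Partitioning: one bucket per date, so a date filter ignores other buckets.
partitions = defaultdict(list)
for day, user, amount in rows:
    partitions[day].append((user, amount))

# Clustering: within each partition, keep rows sorted by the cluster key.
for part in partitions.values():
    part.sort(key=lambda r: r[0])

def query(day, user):
    """Return the amounts for one day and user, and the size of the matching block."""
    part = partitions.get(day, [])            # partition pruning
    keys = [r[0] for r in part]               # built in full here for simplicity
    lo, hi = bisect_left(keys, user), bisect_right(keys, user)
    block = part[lo:hi]                       # contiguous because of the sort
    return [amount for _, amount in block], len(block)

amounts, block_size = query("2024-01-02", "alice")
# amounts == [30, 50], block_size == 2
```

The date filter never touches the 2024-01-01 bucket, and the sorted layout lets a binary search find the block of matching rows.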
Q5. Difference between Nearline and Coldline.
Nearline is for data accessed less frequently, while coldline is for data accessed very infrequently.
Nearline storage is designed for data that is accessed less frequently but still needs to be readily available.
Coldline storage is for data that is accessed very infrequently and is stored at a lower cost.
Nearline storage has a lower retrieval cost than Coldline storage, but a higher at-rest storage cost; its minimum storage duration is also shorter (30 days versus 90 days).
Examples of nearline storage include Google Cloud Storage Nearline, while examples of coldline stora...
Q6. Difference between Bigtable and BigQuery
Bigtable is a NoSQL wide-column database for low-latency, high-throughput workloads, while BigQuery is a fully managed data warehouse for running SQL queries.
Bigtable is a NoSQL database designed for real-time, high-throughput applications.
BigQuery is a fully managed data warehouse that allows users to run SQL queries on large datasets.
Bigtable is optimized for high-speed reads and writes by row key, making it suitable for real-time data processing.
BigQuery is optimized for running complex SQL queries on...
Q7. Explain lazy evaluation in Spark.
Lazy evaluation in Spark delays the execution of transformations until an action is called.
Transformations in Spark are not executed immediately, but are stored as a directed acyclic graph (DAG) of operations.
Actions trigger the execution of the DAG, allowing for optimizations like pipelining and avoiding unnecessary computations.
Lazy evaluation helps in optimizing the execution plan and improving performance by delaying the actual computation until necessary.
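The behaviour can be imitated in a few lines of plain Python. LazyDataset below is a made-up toy class, not Spark's API: map and filter only record steps in a plan, and collect plays the role of an action that finally runs the plan, pushing each element through all steps in a single pass.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records transformations, runs them on an action."""

    def __init__(self, source, plan=()):
        self._source = source
        self._plan = plan            # tuple of ("map" | "filter", function)

    def map(self, fn):
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._plan + (("filter", fn),))

    def collect(self):
        """Action: only now is the recorded plan applied to the data."""
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._plan:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):   # a filter step rejected the item
                    keep = False
                    break
            if keep:
                out.append(item)
        return out

calls = []

def square(x):
    calls.append(x)                  # record that work actually happened
    return x * x

ds = LazyDataset(range(5)).map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)     # 0: nothing has run yet
result = ds.collect()                # [0, 4, 16]
```

In PySpark the same shape would be a chain of transformations followed by an action such as collect() or count().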
Q8. Different storage types in GCP.
Different storage types in GCP include Cloud Storage, Persistent Disk, Cloud SQL, Bigtable, and Datastore.
Cloud Storage: object storage for storing and accessing data from Google Cloud
Persistent Disk: block storage for virtual machine instances
Cloud SQL: fully-managed relational database service
Bigtable: NoSQL wide-column database service for large analytical and operational workloads
Datastore: NoSQL document database for web and mobile applications
Q9. Views in Google BigQuery
Views in Google BigQuery are virtual tables that are defined by a SQL query.
Views allow users to save and reuse complex queries.
Views do not store data themselves, but rather provide a way to organize and simplify querying.
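Views can be shared with other users by granting access to the view or its dataset; an authorized view can expose query results without granting access to the underlying tables.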
Example: CREATE VIEW my_view AS SELECT * FROM my_table WHERE column = 'value';
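The point that a view stores no data can be checked locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for BigQuery, with a hypothetical status column: a row inserted after the view is created appears in the view immediately, because the view is just a saved query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, status TEXT)")
conn.execute("INSERT INTO my_table VALUES (1, 'active'), (2, 'inactive')")

# The view stores only the query text, not the rows it returns.
conn.execute(
    "CREATE VIEW my_view AS SELECT id FROM my_table WHERE status = 'active'"
)
before = [row[0] for row in conn.execute("SELECT id FROM my_view ORDER BY id")]

# A new base-table row shows up in the view with no refresh step.
conn.execute("INSERT INTO my_table VALUES (3, 'active')")
after = [row[0] for row in conn.execute("SELECT id FROM my_view ORDER BY id")]
# before == [1], after == [1, 3]
```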
Q10. Architecture of BigQuery.
BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data.
BigQuery uses a distributed architecture to process and analyze large datasets.
It separates storage and compute, allowing for independent scaling of each.
Data is stored in Capacitor, a proprietary columnar storage format optimized for analytical processing, on Google's Colossus distributed file system.
Query processing is done in Dremel, a distributed query engine that can execute SQL queries on massive datasets.
BigQuery supports...