Cognizant
10+ Envoy Global Interview Questions and Answers
Q1. GCP services: what is the use of BigQuery? What are Pub/Sub, Dataflow, and Cloud Storage? Questions related to previous roles and responsibilities.
BigQuery is a cloud-based data warehousing tool used for analyzing large datasets quickly. Pub/Sub is a messaging service, Dataflow is a data processing tool, and Cloud Storage is a scalable object storage service.
BigQuery is used for analyzing large datasets quickly
Pub/Sub is a messaging service used for asynchronous communication between applications
Dataflow is a data processing tool used for batch and stream processing
Cloud Storage is a scalable object storage service used for storing unstructured data such as files and backups
Q2. What is GCP BigQuery? The architecture of BigQuery, Cloud Composer, what is a DAG, and visualization tools like Looker and Data Studio.
GCP BigQuery is a serverless, highly scalable, and cost-effective data warehouse for analyzing big data sets.
BigQuery is a fully managed, petabyte-scale data warehouse that enables super-fast SQL queries using the processing power of Google's infrastructure.
BigQuery's architecture includes storage, Dremel execution engine, and SQL layer.
Cloud Composer is a managed workflow orchestration service that helps you create, schedule, and monitor pipelines using Apache Airflow.
DAG (Directed Acyclic Graph) is a collection of tasks with directed dependencies and no cycles; Airflow uses a DAG to define the order in which workflow tasks run.
Q3. bq command to show the schema of a table
Use the 'bq show' command to display the schema of a table in BigQuery.
Run 'bq show' followed by the dataset and table name to display the schema.
The schema includes the column names, data types, and mode (NULLABLE or REQUIRED).
Example: bq show --schema --format=prettyjson project_id:dataset.table_name
Q4. What are the GCP services used in your project
The GCP services used in our project include BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
BigQuery for data warehousing and analytics
Dataflow for real-time data processing
Pub/Sub for messaging and event ingestion
Cloud Storage for storing data and files
Q5. How to schedule a job to trigger every hour in Airflow
To schedule a job to trigger every hour in Airflow, use a cron expression as the schedule interval
Define a DAG (Directed Acyclic Graph) in Airflow
Set the schedule_interval parameter to '0 * * * *' to trigger the job at the top of every hour
Example: schedule_interval='0 * * * *' (a timedelta such as schedule_interval=timedelta(hours=1) also runs hourly, though not pinned to the top of the hour)
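The setup above can be sketched as a minimal Airflow 2.x DAG file. This is an illustrative fragment, not the interviewee's actual pipeline; the DAG id, task id, and callable are invented names:

```python
# hourly_example.py -- illustrative Airflow 2.x DAG (all names are examples)
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder task body; a real task would pull or transform data.
    print("running hourly extract")


with DAG(
    dag_id="hourly_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 * * * *",  # cron: minute 0 of every hour
    catchup=False,                  # don't backfill missed runs
) as dag:
    PythonOperator(task_id="extract", python_callable=extract)
```

Dropping this file into Airflow's dags/ folder is enough for the scheduler to pick it up and trigger the task every hour.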
Q6. How to display a string in reverse using Python
Use Python's slicing feature to display a string in reverse order.
Use string slicing with a step of -1 to reverse the string.
Example: 'hello'[::-1] will output 'olleh'.
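A runnable sketch of the slicing approach, with the common `reversed()` alternative for comparison:

```python
def reverse_string(s: str) -> str:
    # Slicing with step -1 walks the string from its last character to its first.
    return s[::-1]


def reverse_string_alt(s: str) -> str:
    # reversed() returns an iterator over characters, so join it back into a string.
    return "".join(reversed(s))


print(reverse_string("hello"))      # -> olleh
print(reverse_string_alt("hello"))  # -> olleh
```

Slicing is the idiomatic one-liner; `reversed()` is useful when you want an iterator rather than a new string up front.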
Q7. What are the data sources used?
Various data sources such as databases, APIs, files, and streaming services are used for data ingestion and processing.
Databases (e.g. MySQL, PostgreSQL)
APIs (e.g. RESTful APIs)
Files (e.g. CSV, JSON)
Streaming services (e.g. Kafka, Pub/Sub)
Q8. How many slots are there in BigQuery?
BigQuery does not have a fixed number of slots; it allocates slots (units of compute) dynamically based on query requirements and the project's pricing model.
BigQuery does not expose a fixed number of slots like traditional databases.
It dynamically allocates slots to each query based on its requirements.
The number of slots a query receives varies with its complexity and size; under on-demand pricing a project is limited to roughly 2,000 concurrent slots by default, while capacity-based reservations provide a purchased, fixed number of slots.
BigQuery's serverless architecture allows it to scale automatically to handle large workloads.
Q9. Explain leaf nodes and columnar storage.
Leaf nodes are the bottom-level nodes in a tree structure, while columnar storage stores data in columns rather than rows.
Leaf nodes are the end nodes in a tree structure, containing actual data or pointers to data.
In BigQuery's Dremel execution tree, the leaf nodes are the servers that read data from storage and perform the initial scans and filters.
Columnar storage stores data in columns rather than rows, so a query reads only the columns it needs, which speeds up analytical queries.
Columnar storage is commonly used in data warehouses and analytics databases.
Leaf nodes are important for efficient data retrieval in tree-based data structures.
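The row-versus-column distinction can be illustrated with plain Python lists. This is a toy model of the two layouts, not BigQuery's actual storage format:

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "region": "US", "amount": 100},
    {"id": 2, "region": "EU", "amount": 250},
    {"id": 3, "region": "US", "amount": 75},
]

# Column-oriented layout: each column is stored contiguously.
columns = {
    "id": [1, 2, 3],
    "region": ["US", "EU", "US"],
    "amount": [100, 250, 75],
}

# A query like SELECT SUM(amount): the columnar layout touches only
# one list, while the row layout must visit every whole record.
total_row = sum(r["amount"] for r in rows)
total_col = sum(columns["amount"])
print(total_row, total_col)  # -> 425 425
```

The saving in the toy example is trivial, but on a wide table with billions of rows, reading one column instead of every record is exactly why columnar engines are fast for analytics.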
Q10. bq commands to create a table and load a CSV file
Using bq commands to create a table and load a CSV file in Google BigQuery
Use the 'bq mk' command to create a new table in BigQuery
Use the 'bq load' command to load a CSV file into the created table
Specify the schema and source format when creating the table
Specify the source format and destination table when loading the CSV file
Example: bq mk --table dataset.table_name schema.json
Example: bq load --source_format=CSV --skip_leading_rows=1 dataset.table_name data.csv
Q11. What is a Cloud Function?
Cloud Functions are event-driven functions that run in response to cloud events.
Serverless functions that automatically scale based on demand
Can be triggered by events from various cloud services
Supports multiple programming languages like Node.js, Python, etc.
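A minimal HTTP-triggered function body in Python can look like the sketch below. The function name is illustrative; on GCP the `request` argument is a Flask request object, and returning a plain string produces a 200 response:

```python
def hello_http(request):
    # On Cloud Functions, `request` is a flask.Request; here we guard
    # with getattr so the handler can also be exercised locally.
    name = "World"
    if request is not None and getattr(request, "args", None):
        name = request.args.get("name", "World")
    return f"Hello, {name}!"


# Local smoke test without a real HTTP request:
print(hello_http(None))  # -> Hello, World!
```

Deployed behind an HTTP trigger, the same handler runs on demand and scales to zero when idle, which is the core appeal of the serverless model.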
Q12. Partitioning vs clustering
Partitioning divides a table into segments (for example, by date) so queries scan only the segments they need, while clustering physically orders the data within the table (or within each partition) by the values of one or more columns.
Partitioning is done at the storage level and lets the engine prune entire partitions that a query does not reference.
Clustering sorts data by the clustering columns, so queries that filter or aggregate on those columns read fewer blocks.
Example: Partitioning a sales table by date means a query for one day's sales scans only that day's partition instead of the whole table.
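Partition pruning can be mimicked in a few lines of Python. This is a toy model (a dict keyed by date); real BigQuery partitions are tracked in storage metadata, but the scan-avoidance idea is the same:

```python
# Toy model: a 'table' partitioned by date, stored as date -> rows.
partitions = {
    "2024-01-01": [{"order_id": 1, "amount": 100}],
    "2024-01-02": [{"order_id": 2, "amount": 250},
                   {"order_id": 3, "amount": 75}],
    "2024-01-03": [{"order_id": 4, "amount": 40}],
}


def query_day(day: str) -> int:
    # The date filter prunes the scan to a single partition;
    # rows in every other partition are never touched.
    return sum(r["amount"] for r in partitions.get(day, []))


print(query_day("2024-01-02"))  # -> 325
```

With on-demand pricing billed by bytes scanned, this pruning is also why a date filter on a partitioned table is dramatically cheaper than the same filter on an unpartitioned one.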