Vyapar
To productionize data pipelines, one must automate, monitor, and scale them for efficient and reliable data processing.
Automate the data pipeline using tools like Apache Airflow or Kubernetes
Monitor the pipeline for errors, latency, and data quality issues using monitoring tools like Prometheus or Grafana
Scale the pipeline by optimizing code, using distributed computing frameworks like Spark, and leveraging clou...
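The automate/monitor steps above can be sketched as a minimal Python pipeline skeleton. This is a sketch only: the extract, transform, and load functions are hypothetical placeholders, and the retry loop stands in for what an orchestrator like Airflow would normally provide.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def extract():
    # Hypothetical source: in practice this would read from an API or database.
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": None}]

def transform(rows):
    # Basic data-quality check: drop rows with a missing amount.
    clean = [r for r in rows if r["amount"] is not None]
    log.info("dropped %d bad rows", len(rows) - len(clean))
    return clean

def load(rows):
    # Hypothetical sink: in practice a warehouse or table write.
    return len(rows)

def run_with_retries(max_attempts=3):
    # Simple retry loop standing in for an orchestrator's task retries.
    for attempt in range(1, max_attempts + 1):
        try:
            start = time.monotonic()
            loaded = load(transform(extract()))
            log.info("loaded %d rows in %.3fs", loaded, time.monotonic() - start)
            return loaded
        except Exception:
            log.exception("attempt %d failed", attempt)
    raise RuntimeError("pipeline failed after all retries")
```

In a real deployment each function would become an Airflow task, and the latency and row counts logged here would feed a monitoring stack such as Prometheus/Grafana.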
BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data.
BigQuery uses a distributed architecture to process and analyze large datasets quickly.
It allows users to run SQL-like queries on datasets stored in Google Cloud Storage.
BigQuery automatically scales to handle large amounts of data and can be integrated with other Google Cloud services.
It supports real-time data...
I applied via campus placement at The LNM Institute of Information Technology, Jaipur and was interviewed in Nov 2024. There was 1 interview round.
Versioning in AWS allows you to manage different versions of your resources.
AWS S3 supports object versioning to keep multiple versions of an object in the same bucket.
AWS Lambda supports versioning to manage different versions of your functions.
AWS API Gateway supports versioning to manage different versions of your APIs.
They asked about SQL queries and BI dashboard knowledge.
I am a data engineer with experience in BI dashboard development and overcoming challenges in the field.
I have worked on various data engineering projects, including building data pipelines and optimizing data storage.
One of my notable projects involved developing a real-time data processing system for a retail company, which improved their inventory management and sales forecasting.
In my previous role, I faced challen...
Answers to questions related to Data Models, ADF activities, and IR.
Data models are the representation of data structures and relationships between them.
ADF activities include data movement, data transformation, and control activities.
IR stands for Integration Runtime, which is a compute infrastructure used to provide data integration capabilities across different network environments.
I applied via Company Website and was interviewed in Aug 2024. There were 2 interview rounds.
It was a Python coding and SQL query assessment round.
Joins are used to combine rows from two or more tables based on a related column between them.
Use JOIN keyword to combine tables based on a common column
Types of joins include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN
Example: SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id
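The join types above can be tried out with the standard-library sqlite3 module; the table and column names here are illustrative, matching the example query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, city TEXT);
    INSERT INTO table1 VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO table2 VALUES (1, 'Pune'), (3, 'Delhi');
""")

# INNER JOIN keeps only ids present in both tables.
inner = conn.execute(
    "SELECT t1.name, t2.city FROM table1 t1 "
    "INNER JOIN table2 t2 ON t1.id = t2.id").fetchall()

# LEFT JOIN keeps every row of table1, padding non-matches with NULL.
left = conn.execute(
    "SELECT t1.name, t2.city FROM table1 t1 "
    "LEFT JOIN table2 t2 ON t1.id = t2.id").fetchall()

print(inner)  # [('Asha', 'Pune')]
print(left)   # [('Asha', 'Pune'), ('Ravi', None)]
```

Note that SQLite only added native RIGHT JOIN and FULL JOIN in version 3.39; the same syntax works unchanged in most other SQL databases.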
Projects are specific tasks or assignments that require a set of skills and resources to complete.
Projects are temporary endeavors with a defined beginning and end.
They are unique, with specific goals, deliverables, and constraints.
Projects require a team of individuals with different roles and responsibilities.
Examples: Developing a data pipeline for real-time analytics, building a recommendation system for an e-comme
I applied via Walk-in and was interviewed in May 2024. There was 1 interview round.
Indexing in SQL is a technique used to improve the performance of queries by creating a data structure that allows for faster retrieval of data.
Indexes are created on columns in a database table to speed up the retrieval of data.
They work similarly to the index in a book, allowing the database to quickly find the rows that match a certain condition.
Indexes can be created using a single column or a combination of columns.
...
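The effect of an index can be seen directly with sqlite3 and EXPLAIN QUERY PLAN; the table and index names below are illustrative, and the exact plan wording varies slightly between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(i, "eng" if i % 2 else "ops", 1000 + i) for i in range(1000)])

# Without an index, the dept predicate forces a full table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE dept = 'eng'").fetchall()
print(plan_before[0][-1])  # e.g. "SCAN employees" (wording varies by version)

# A single-column index lets SQLite seek directly to the matching rows.
conn.execute("CREATE INDEX idx_dept ON employees (dept)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employees WHERE dept = 'eng'").fetchall()
print(plan_after[0][-1])  # e.g. "SEARCH employees USING INDEX idx_dept (dept=?)"
```

The trade-off is that each index consumes storage and slows down writes, since it must be maintained on every INSERT and UPDATE.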
I have extensive experience in using Pyspark and Python for data engineering tasks.
I have worked on various projects involving data processing, transformation, and analysis using Pyspark and Python.
I am proficient in writing efficient and optimized code in Pyspark for big data processing.
I have experience in handling large datasets and implementing complex data pipelines using Pyspark and Python.
They asked about SQL queries and BI dashboard knowledge.
Answers to questions related to Data Engineer role
Data models are the logical representation of data objects and their relationships
ADF activities include data movement, data transformation, and control activities
IR stands for Integration Runtime, which is a compute infrastructure used to provide data integration capabilities
posted on 22 Nov 2024
I applied via Newspaper Ad and was interviewed in May 2024. There were 3 interview rounds.
Programming question: star pattern
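A typical star-pattern exercise of this kind can be answered in a few lines of Python; the exact shape asked is not recorded here, so this assumes the common right-angled triangle variant.

```python
def star_triangle(n):
    # Build a right-angled triangle of '*' characters, one row per line.
    return "\n".join("*" * i for i in range(1, n + 1))

print(star_triangle(4))
# *
# **
# ***
# ****
```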
ETL is Extract, Transform, Load where data is extracted, transformed, and loaded in that order. ELT is Extract, Load, Transform where data is extracted, loaded, and then transformed.
ETL: Data is extracted from the source, transformed in a separate system, and then loaded into the target system.
ELT: Data is extracted from the source, loaded into the target system, and then transformed within the target system.
ETL is sui...
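The ordering difference between ETL and ELT can be illustrated with sqlite3 standing in for the target system; in a real ELT setup the transform would run inside a warehouse engine rather than SQLite, and all names here are illustrative.

```python
import sqlite3

raw = [("a", "10"), ("b", "x"), ("c", "30")]  # raw strings, one unparseable

# ETL: transform outside the target system, then load only clean rows.
def etl(conn):
    clean = [(k, int(v)) for k, v in raw if v.isdigit()]
    conn.executemany("INSERT INTO etl_out VALUES (?, ?)", clean)

# ELT: load the raw data as-is, then transform inside the target with SQL.
def elt(conn):
    conn.executemany("INSERT INTO staging VALUES (?, ?)", raw)
    conn.execute("""INSERT INTO elt_out
                    SELECT k, CAST(v AS INTEGER) FROM staging
                    WHERE v GLOB '[0-9]*'""")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE etl_out (k TEXT, v INTEGER);
    CREATE TABLE staging (k TEXT, v TEXT);
    CREATE TABLE elt_out (k TEXT, v INTEGER);
""")
etl(conn)
elt(conn)
print(conn.execute("SELECT COUNT(*) FROM etl_out").fetchone()[0])  # 2
print(conn.execute("SELECT COUNT(*) FROM elt_out").fetchone()[0])  # 2
```

Both paths end with the same clean data; the practical difference is where the compute happens and that ELT keeps the raw rows available in staging for reprocessing.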
The difference between two values is the result of subtracting one from the other.
The difference between two values can be positive, negative, or zero.
For example, the difference between 10 and 5 is 5.
Find the repeating element in a list
Iterate through the list and keep track of elements seen so far
Use a hash set to efficiently check for duplicates
Return the first element that is already in the set
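The steps above translate directly into a short Python function using a set for O(1) membership checks:

```python
def first_repeating(items):
    # Track elements seen so far; the first element already in the set repeats.
    seen = set()
    for x in items:
        if x in seen:
            return x
        seen.add(x)
    return None  # no element repeats

print(first_repeating([3, 1, 4, 1, 5, 3]))  # 1
```

This runs in O(n) time and O(n) extra space; note it returns the element whose second occurrence comes first, which is why the answer here is 1 rather than 3.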
Use the MAX() function in SQL to calculate the maximum salary.
Use the MAX() function along with the column name of the salary field.
Example: SELECT MAX(salary) FROM employees;
Ensure the correct table and column names are used in the query.
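The MAX() query above can be run end to end with sqlite3; the employee names and salaries here are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Asha", 52000), ("Ravi", 61000), ("Meena", 58000)])

# MAX() is an aggregate: it collapses the column to its single largest value.
(max_salary,) = conn.execute("SELECT MAX(salary) FROM employees").fetchone()
print(max_salary)  # 61000
```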
Business Development Executive | 134 salaries | ₹2.8 L/yr - ₹5.4 L/yr
Accounts Manager | 80 salaries | ₹2.8 L/yr - ₹6 L/yr
Inside Sales Executive | 78 salaries | ₹2.2 L/yr - ₹4.5 L/yr
Team Lead | 64 salaries | ₹4 L/yr - ₹7.8 L/yr
Customer Support Executive | 62 salaries | ₹2.8 L/yr - ₹4.2 L/yr
Zoho
Tally Solutions
MARG ERP
Busy Infotech