Use a SQL query to find the product with the second-highest total price.
Use the ORDER BY clause to sort the products by total price in descending order.
Use the LIMIT clause with an offset (for example, LIMIT 1 OFFSET 1) to skip the top row and return the second row after sorting.
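The steps above can be sketched with Python's built-in sqlite3 module. The table name `order_items` and its columns are hypothetical, since the original question does not give a schema, and "total price" is assumed here to mean quantity times unit price summed per product.

```python
import sqlite3

# Hypothetical schema and sample data: the source does not name the table or columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (product TEXT, quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [
        ("keyboard", 2, 30.0),
        ("monitor", 1, 200.0),
        ("mouse", 5, 10.0),
        ("keyboard", 1, 30.0),
    ],
)

# Sort products by total price (descending), skip the first row, keep one row.
SECOND_HIGHEST = """
    SELECT product, SUM(quantity * unit_price) AS total_price
    FROM order_items
    GROUP BY product
    ORDER BY total_price DESC
    LIMIT 1 OFFSET 1
"""

second = conn.execute(SECOND_HIGHEST).fetchone()
print(second)
```

Note that LIMIT with OFFSET ignores ties: if two products share the highest total, this returns one of them as "second". A window function such as DENSE_RANK() is a common alternative when ties matter.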
I applied via Naukri.com and was interviewed in Mar 2022. There were 3 interview rounds.
posted on 11 Dec 2024
PySpark is a Python API for Apache Spark, used for big data processing and analytics.
PySpark is a Python API for Apache Spark, a fast and general-purpose cluster computing system.
It allows for easy integration with Python libraries and provides high-level APIs in Python.
PySpark can be used for processing large datasets, machine learning, real-time data streaming, and more.
It supports various data sources such as HDFS, ...
PySpark is a Python API for Apache Spark, while Python is a general-purpose programming language.
PySpark is specifically designed for big data processing using Spark, while Python is a versatile programming language used for various applications.
PySpark distributes computation across a cluster and processes data in parallel, while plain Python code runs on a single machine by default.
PySpark provides libraries and tools for working wit...
I applied via Campus Placement and was interviewed in Oct 2024. There was 1 interview round.
Use a regular expression to remove special characters from a string.
Use the regex pattern [^a-zA-Z0-9\s] to match any character that is not a letter, digit, or whitespace.
Use your language's regex replace function (for example, re.sub in Python) to replace the matched special characters with an empty string.
Example: input string 'Hello! How are you?' will become 'Hello How are you' after removing special characters
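A minimal sketch of this approach in Python, using the pattern from the answer. The function name `remove_special_characters` is only illustrative.

```python
import re

# Matches any single character that is not a letter, digit, or whitespace.
SPECIAL_CHARS = re.compile(r"[^a-zA-Z0-9\s]")


def remove_special_characters(text: str) -> str:
    """Return text with every matched special character removed."""
    return SPECIAL_CHARS.sub("", text)


cleaned = remove_special_characters("Hello! How are you?")
print(cleaned)  # Hello How are you
```

This pattern also strips underscores and any non-ASCII letters, which may or may not be what the interviewer intends.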
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
Databricks is a unified analytics platform that provides a collaborative environment for data scientists, engineers, and analysts.
Databricks is built on top of Apache Spark, providing a unified platform for data engineering, data science, and business analytics.
Internals of Databricks include a cluster manager, job scheduler, and workspace for collaboration.
Optimization techniques in Databricks include query optimizati...
posted on 9 Dec 2024
I applied via LinkedIn and was interviewed in Jun 2024. There were 3 interview rounds.
General questions around data engineering.
I was approached by the company and was interviewed in Apr 2024. There was 1 interview round.
I have handled terabytes of data in my POCs, including data from various sources and formats.
Handled terabytes of data in POCs
Worked with data from various sources and formats
Used tools like Hadoop, Spark, and SQL for data processing
Repartition can increase or decrease the number of partitions and always performs a full shuffle, while coalesce only decreases the number of partitions and avoids a full shuffle.
Repartition is used when more partitions are needed to increase parallelism, or when data must be redistributed evenly.
Coalesce is used when there are too many partitions and they need to be reduced without the cost of a full shuffle.
Example: Repartition can be used before a join operation to evenly distribute data across partitions for bette...
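The difference can be illustrated with a toy model in plain Python. This is not Spark's implementation and Spark's actual row placement differs; it only mimics the behavior described above. In PySpark itself the corresponding calls are `df.repartition(n)` and `df.coalesce(n)`. The helper names below are hypothetical.

```python
from typing import List

Partitions = List[List[int]]


def toy_repartition(partitions: Partitions, n: int) -> Partitions:
    """Model a full shuffle: every row may move, dealt round-robin into n partitions."""
    rows = [row for part in partitions for row in part]
    result: Partitions = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        result[i % n].append(row)
    return result


def toy_coalesce(partitions: Partitions, n: int) -> Partitions:
    """Model a merge without a full shuffle: whole partitions are combined as-is."""
    if n >= len(partitions):
        return partitions  # coalesce does not increase the partition count
    result: Partitions = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        result[i % n].extend(part)
    return result


skewed = [[1, 2, 3, 4, 5, 6], [7], [8], []]
balanced = toy_repartition(skewed, 4)
merged = toy_coalesce(skewed, 2)
print([len(p) for p in balanced])  # [2, 2, 2, 2]
print([len(p) for p in merged])    # [7, 1]
```

In this model the repartitioned data ends up evenly spread, while the coalesced data keeps its skew because existing partitions are merged rather than redistributed.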
Designing/configuring a cluster for 10 petabytes of data involves considerations for storage capacity, processing power, network bandwidth, and fault tolerance.
Consider using a distributed file system like HDFS or object storage like Amazon S3 to store and manage the large volume of data.
Implement a scalable processing framework like Apache Spark or Hadoop to efficiently process and analyze the data in parallel.
Utilize...
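As a rough illustration of the storage-capacity consideration, the arithmetic below estimates disk needs for HDFS-style replicated storage. Only the 10 PB figure comes from the question; the headroom and per-node disk capacity are assumed values chosen for the example, and the replication factor of 3 is the HDFS default. Object storage such as Amazon S3 manages durability itself, so this estimate would not apply there.

```python
import math

# All figures except RAW_DATA_PB are illustrative assumptions, not recommendations.
RAW_DATA_PB = 10           # from the interview question
REPLICATION_FACTOR = 3     # HDFS default replication factor
HEADROOM = 0.25            # assumed spare capacity for growth and temporary data
USABLE_TB_PER_NODE = 96    # hypothetical usable disk per worker node


def required_storage_pb(raw_pb: float, replication: int, headroom: float) -> float:
    """Raw data multiplied by replication, plus a fractional headroom."""
    return raw_pb * replication * (1 + headroom)


def required_nodes(storage_pb: float, usable_tb_per_node: float) -> int:
    """Number of nodes needed to hold storage_pb, taking 1 PB as 1000 TB."""
    return math.ceil(storage_pb * 1000 / usable_tb_per_node)


storage_pb = required_storage_pb(RAW_DATA_PB, REPLICATION_FACTOR, HEADROOM)
nodes = required_nodes(storage_pb, USABLE_TB_PER_NODE)
print(storage_pb, nodes)  # 37.5 391
```

A real design would also size CPU, memory, and network bandwidth against the workload, which this storage-only estimate does not cover.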
I applied via Job Portal and was interviewed in Aug 2024. There were 3 interview rounds.
It's a mandatory test, even for experienced candidates.
| Senior Engineer | 5.3k salaries | ₹5 L/yr - ₹17 L/yr |
| Engineer | 4.4k salaries | ₹1 L/yr - ₹8.9 L/yr |
| Technical Lead | 2k salaries | ₹8.2 L/yr - ₹26.9 L/yr |
| Project Lead | 1.6k salaries | ₹6 L/yr - ₹22.1 L/yr |
| Senior Software Engineer | 1.4k salaries | ₹4.8 L/yr - ₹18.2 L/yr |