eClerx
posted on 11 Dec 2024
PySpark is the Python API for Apache Spark, a fast, general-purpose cluster computing system used for big data processing and analytics.
It integrates easily with Python libraries and provides high-level APIs in Python.
PySpark can be used for processing large datasets, machine learning, real-time data streaming, and more.
It supports various data sources such as HDFS, ...
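A minimal sketch of what that looks like in practice, assuming a local PySpark installation and a hypothetical events.csv file:
```python
from pyspark.sql import SparkSession

# Start (or reuse) a local Spark session.
spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

# Read a CSV file into a DataFrame and run a simple aggregation.
df = spark.read.csv("events.csv", header=True, inferSchema=True)
df.groupBy("event_type").count().show()

spark.stop()
```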
PySpark is a Python API for Apache Spark, while Python is a general-purpose programming language.
PySpark is specifically designed for big data processing using Spark, while Python is a versatile programming language used for various applications.
PySpark enables distributed computing and parallel processing across a cluster, while plain Python code typically runs sequentially on a single machine.
PySpark provides libraries and tools for working wit...
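As a hedged illustration of that difference, here is the same word count written in plain Python (single process) and in PySpark (distributed transformations); the logs.txt file name is made up:
```python
from collections import Counter
from pyspark.sql import SparkSession

# Plain Python: runs sequentially in one process on one machine.
with open("logs.txt") as f:
    local_counts = Counter(word for line in f for word in line.split())

# PySpark: the same logic expressed as distributed transformations on an RDD.
spark = SparkSession.builder.appName("wordcount").getOrCreate()
rdd_counts = (spark.sparkContext.textFile("logs.txt")
              .flatMap(lambda line: line.split())
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))
print(rdd_counts.take(10))
spark.stop()
```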
I applied via Campus Placement and was interviewed in Oct 2024. There was 1 interview round.
Use regular expression to remove special characters from a string
Use the regex pattern [^a-zA-Z0-9\s] to match any character that is not a letter, digit, or whitespace
Use your language's regex replace function (such as re.sub in Python) to replace the matched special characters with an empty string
Example: input string 'Hello! How are you?' will become 'Hello How are you' after removing special characters
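A minimal Python sketch of this approach, with re.sub playing the role of the replace step described above:
```python
import re

def remove_special_characters(text: str) -> str:
    # Keep only letters, digits, and whitespace; drop everything else.
    return re.sub(r"[^a-zA-Z0-9\s]", "", text)

print(remove_special_characters("Hello! How are you?"))  # -> "Hello How are you"
```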
I applied via Naukri.com and was interviewed in Nov 2024. There was 1 interview round.
Databricks is a unified analytics platform that provides a collaborative environment for data scientists, engineers, and analysts.
Databricks is built on top of Apache Spark, providing a unified platform for data engineering, data science, and business analytics.
Internals of Databricks include a cluster manager, job scheduler, and workspace for collaboration.
Optimization techniques in Databricks include query optimizati...
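Since Databricks runs standard Spark, one of the optimization techniques mentioned above can be sketched in plain PySpark: caching a reused DataFrame and inspecting the optimized query plan. The data and column names are hypothetical; on Databricks the DataFrame would usually come from a table or Delta path.
```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("optimization-demo").getOrCreate()

# Hypothetical sales data for illustration only.
sales = spark.createDataFrame(
    [("APAC", "book", 450.0), ("EMEA", "pen", 120.0), ("APAC", "lamp", 700.0)],
    ["region", "product", "amount"],
)

# Cache a DataFrame that several downstream queries reuse, so it is not recomputed each time.
apac = sales.filter(F.col("region") == "APAC").cache()

# explain() prints the plan produced by the optimizer, useful for spotting scans and shuffles.
apac.groupBy("product").agg(F.sum("amount").alias("total")).explain()
apac.groupBy("product").count().show()

spark.stop()
```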
posted on 9 Dec 2024
I applied via LinkedIn and was interviewed in Jun 2024. There were 3 interview rounds.
General questions around data engineering
I applied via Approached by Company and was interviewed in Apr 2024. There was 1 interview round.
I have handled terabytes of data in my POCs, including data from various sources and formats.
Handled terabytes of data in POCs
Worked with data from various sources and formats
Used tools like Hadoop, Spark, and SQL for data processing
Repartition increases (or evenly redistributes) partitions for parallelism, while coalesce decreases partitions to avoid a full shuffle, as shown in the sketch below.
Repartition is used when more partitions are needed to increase parallelism.
Coalesce is used when there are too many partitions and they need to be merged without reshuffling all the data.
Example: Repartition can be used before a join operation to evenly distribute data across partitions for bette...
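A small PySpark sketch of the difference, assuming a local Spark session:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partitioning-demo").getOrCreate()
df = spark.range(0, 1_000_000)  # simple demo DataFrame with an "id" column

# repartition(n) triggers a full shuffle and can increase the partition count,
# e.g. to spread data evenly before a wide operation such as a join.
wide = df.repartition(200)
print(wide.rdd.getNumPartitions())    # 200

# coalesce(n) only merges existing partitions (no full shuffle),
# e.g. to avoid writing thousands of tiny output files.
narrow = wide.coalesce(10)
print(narrow.rdd.getNumPartitions())  # 10

spark.stop()
```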
Designing/configuring a cluster for 10 petabytes of data involves considerations for storage capacity, processing power, network bandwidth, and fault tolerance.
Consider using a distributed file system like HDFS or object storage like Amazon S3 to store and manage the large volume of data.
Implement a scalable processing framework like Apache Spark or Hadoop to efficiently process and analyze the data in parallel.
Utilize...
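A back-of-envelope sizing sketch for the storage side; every number below is an assumption chosen to illustrate the arithmetic, not a recommendation:
```python
# Rough capacity estimate for storing 10 PB with replication and headroom.
raw_data_pb = 10                # data to store, in petabytes
replication_factor = 3          # e.g. HDFS default replication
overhead = 1.25                 # headroom for temp/shuffle data and growth
usable_disk_per_node_tb = 48    # assumed usable disk per data node, in TB

total_needed_tb = raw_data_pb * 1024 * replication_factor * overhead
nodes = total_needed_tb / usable_disk_per_node_tb
print(f"~{total_needed_tb:,.0f} TB of raw disk, roughly {nodes:.0f} data nodes")
```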
Use a SQL query to find the product with the second highest total price.
Use the ORDER BY clause to sort products by total price in descending order.
Use LIMIT 1 OFFSET 1, or a ranking subquery, to pick the second row after sorting, as in the sketch below.
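A hedged sketch using Spark SQL through PySpark, so all examples stay in one language; the table, data, and column names are made up:
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("second-highest").getOrCreate()

# Hypothetical order lines: product plus total price per line.
orders = spark.createDataFrame(
    [("pen", 120.0), ("book", 450.0), ("bag", 999.0), ("book", 300.0), ("lamp", 700.0)],
    ["product", "total_price"],
)
orders.createOrReplaceTempView("orders")

# Rank products by their summed total price and keep the second one.
spark.sql("""
    SELECT product, total
    FROM (
        SELECT product,
               SUM(total_price) AS total,
               DENSE_RANK() OVER (ORDER BY SUM(total_price) DESC) AS rnk
        FROM orders
        GROUP BY product
    ) ranked
    WHERE rnk = 2
""").show()

spark.stop()
```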
I applied via Job Portal and was interviewed in Aug 2024. There were 3 interview rounds.
It is a mandatory test, even for experienced people.
My strengths include strong analytical skills, attention to detail, and problem-solving abilities.
Strong analytical skills - able to analyze complex data sets and derive meaningful insights
Attention to detail - meticulous in ensuring data accuracy and quality
Problem-solving abilities - adept at identifying and resolving data-related issues
Experience with data manipulation tools like SQL, Python, and Spark
Seeking new challenges and growth opportunities in a different environment.
Looking for new challenges to enhance my skills and knowledge
Seeking growth opportunities that align with my career goals
Interested in exploring different technologies and industries
Want to work in a more collaborative team environment
Seeking better work-life balance or location proximity
eClerx salaries by designation:
Senior Analyst: 5.4k salaries, ₹2 L/yr - ₹8 L/yr
Financial Analyst: 4k salaries, ₹1.2 L/yr - ₹4.8 L/yr
Analyst: 4k salaries, ₹1 L/yr - ₹7.5 L/yr
Associate Process Manager: 2.4k salaries, ₹3.8 L/yr - ₹14.5 L/yr
Processing Manager: 1.7k salaries, ₹6 L/yr - ₹20 L/yr