The question seems to be incomplete or unclear, possibly a mistake in transcription.
Ask for clarification or more context from the interviewer.
Confirm if the question was meant to be asked in a different way.
Offer to provide a response based on a different question related to data engineering.
Python code with different case
Python is case-sensitive, so variables with different case are treated as different variables
It is recommended to use consistent naming conventions to avoid confusion
Examples: 'myVar', 'myvar', and 'MYVAR' are three different variables
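A minimal sketch of the point above: the three names from the example are bound to different values, and a fourth spelling that was never assigned raises a `NameError`. The values are arbitrary and chosen only for illustration.

```python
# Three names that differ only in case are three separate variables.
myVar = 1
myvar = 2
MYVAR = 3

print(myVar, myvar, MYVAR)

# A spelling that was never assigned is an undefined name.
try:
    print(MyVar)
except NameError as exc:
    print(f"NameError: {exc}")
```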
I applied via Naukri.com and was interviewed in Oct 2024. There were 2 interview rounds.
Spark performance problems can arise due to inefficient code, data skew, resource constraints, and improper configuration.
Inefficient code slows jobs down; for example, calling collect() on a large dataset pulls every row to the driver.
Data skew is an uneven distribution of data across partitions, so a few oversized partitions dominate the processing time.
Resource constraints like insufficient memory or CPU can result in slow Spark jobs.
Improper configuration settings, su...
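The data skew point can be illustrated without a cluster. The sketch below is plain Python, not Spark code: it uses a CRC32-based stand-in for a hash partitioner, a synthetic dataset in which one hypothetical key (`hot_customer`) holds 90% of the rows, and key salting as one common mitigation. The partition count, salt range, and key names are all assumptions made for the example.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # assumed partition count for the illustration
NUM_SALTS = 8       # assumed number of salt values per key


def partition_of(key: str) -> int:
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def partition_sizes(keys):
    """Count how many records land in each partition."""
    counts = Counter(partition_of(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


def salt(key: str, rng: random.Random) -> str:
    """Append a random suffix so one hot key becomes several distinct keys."""
    return f"{key}#{rng.randrange(NUM_SALTS)}"


# Synthetic skewed data: one hot key holds 90% of the rows.
keys = ["hot_customer"] * 9000 + [f"customer_{i}" for i in range(1000)]

rng = random.Random(0)
plain = partition_sizes(keys)
salted = partition_sizes([salt(k, rng) for k in keys])

print("largest partition, plain keys: ", max(plain))
print("largest partition, salted keys:", max(salted))
```

With plain keys, every row for the hot key lands in the same partition. Salting spreads those rows out, at the cost of a second aggregation step to merge the per-salt partial results, which this sketch does not show.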
I applied via LinkedIn and was interviewed in Nov 2024. There was 1 interview round.
The aptitude test covered quantitative aptitude, logical reasoning, and reading comprehension.
I have strong skills in data processing, ETL, data modeling, and programming languages like Python and SQL.
Proficient in data processing and ETL techniques
Strong knowledge of data modeling and database design
Experience with programming languages like Python and SQL
Familiarity with big data technologies such as Hadoop and Spark
Yes, I am open to relocating for the right opportunity.
I am willing to relocate for the right job opportunity.
I have experience moving for previous roles.
I am flexible and adaptable to new locations.
I am excited about the possibility of exploring a new city or country.
posted on 11 Dec 2024
PySpark is a Python API for Apache Spark, used for big data processing and analytics.
PySpark is a Python API for Apache Spark, a fast and general-purpose cluster computing system.
It allows for easy integration with Python libraries and provides high-level APIs in Python.
PySpark can be used for processing large datasets, machine learning, real-time data streaming, and more.
It supports various data sources such as HDFS, ...
PySpark is a Python API for Apache Spark, while Python is a general-purpose programming language.
PySpark is specifically designed for big data processing using Spark, while Python is a versatile programming language used for various applications.
PySpark distributes computation across a cluster and processes partitions in parallel, while plain Python code runs on a single machine unless you add parallelism yourself.
PySpark provides libraries and tools for working wit...
I applied via Campus Placement and was interviewed in Oct 2024. There was 1 interview round.
Use regular expression to remove special characters from a string
Use the regex pattern [^a-zA-Z0-9\s] to match any character that is not a letter, digit, or whitespace
Use your language's regex replace function (for example, re.sub() in Python) to replace each matched special character with an empty string
Example: input string 'Hello! How are you?' will become 'Hello How are you' after removing special characters
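The steps above can be sketched in Python with the standard `re` module. The function name `remove_special_characters` is a hypothetical helper introduced for this example; the pattern is the one given in the answer.

```python
import re

# Matches any character that is not an ASCII letter, a digit, or whitespace.
SPECIAL_CHARS = re.compile(r"[^a-zA-Z0-9\s]")


def remove_special_characters(text: str) -> str:
    """Return text with every matched special character removed."""
    return SPECIAL_CHARS.sub("", text)


print(remove_special_characters("Hello! How are you?"))  # Hello How are you
```

Because the character class lists only ASCII letters, accented letters such as "é" are also removed; a different pattern would be needed to keep them.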
Databricks is a unified analytics platform that provides a collaborative environment for data scientists, engineers, and analysts.
Databricks is built on top of Apache Spark, providing a unified platform for data engineering, data science, and business analytics.
Internals of Databricks include a cluster manager, job scheduler, and workspace for collaboration.
Optimization techniques in Databricks include query optimizati...
posted on 9 Dec 2024
I applied via LinkedIn and was interviewed in Jun 2024. There were 3 interview rounds.
General questions about data engineering
I applied via Company Website and was interviewed in Jul 2024. There was 1 interview round.
Pods are the smallest deployable units in Kubernetes, consisting of one or more containers.
Pods are used to run and manage containers in Kubernetes
Each pod has its own unique IP address within the Kubernetes cluster
Pods can contain multiple containers that share resources and are scheduled together
Pods are ephemeral and can be easily created, destroyed, or replicated
Pods can be managed and scaled using Kubernetes contr
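As a rough illustration of a pod with more than one container, the sketch below builds a Pod manifest as a Python dictionary and prints it as JSON, a format Kubernetes accepts alongside YAML. The pod name, labels, container names, and image tags are placeholders chosen for the example, not values from the interview.

```python
import json

# Hypothetical Pod: the names and image tags below are placeholders.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-with-sidecar", "labels": {"app": "web"}},
    "spec": {
        # Both containers are scheduled together and share the pod's IP address.
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.27",
                "ports": [{"containerPort": 80}],
            },
            {
                "name": "sidecar",
                "image": "busybox:1.36",
                "command": ["sh", "-c", "sleep 3600"],
            },
        ]
    },
}

print(json.dumps(pod_manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`. In practice pods are usually created indirectly through a controller rather than written by hand.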
| Role | Salaries reported | Salary range |
| --- | --- | --- |
| Software Engineer | 13 | ₹5 L/yr - ₹10 L/yr |
| Senior Software Engineer | 11 | ₹8.5 L/yr - ₹13.8 L/yr |
| Softwaretest Engineer | 5 | ₹3 L/yr - ₹9.5 L/yr |
| Software Developer | 4 | ₹5.2 L/yr - ₹13 L/yr |
| QA Engineer | 4 | ₹3.2 L/yr - ₹5.5 L/yr |