A lambda function in Python is a small anonymous function defined using the lambda keyword.
Lambda functions can have any number of arguments, but can only have one expression.
Syntax: lambda arguments : expression
Example: lambda x, y : x + y
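A quick runnable illustration of the syntax above, including the common idiomatic use of a lambda as a sort key:

```python
# The example from above: two arguments, one expression.
add = lambda x, y: x + y
print(add(2, 3))  # 5

# Typical real-world use: a short inline key function for sorting.
pairs = [(1, "b"), (2, "a")]
pairs.sort(key=lambda p: p[1])  # sort by the second element
print(pairs)  # [(2, 'a'), (1, 'b')]
```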
dbutils is a utility provided by Databricks for interacting with files and directories in the Databricks environment.
dbutils.fs.ls('/') - list files in root directory
dbutils.fs.cp('dbfs:/file.txt', 'file.txt') - copy file from DBFS to local file system
dbutils.fs.mkdirs('dbfs:/new_dir') - create a new directory in DBFS
A commit in SQL is a command that saves all the changes made in a transaction to the database.
A commit is used to make all the changes made in a transaction permanent.
Once a commit is issued, the changes cannot be rolled back.
It is important to use commit to ensure data integrity and consistency.
Example: COMMIT; - this command is used to commit the changes in a transaction.
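The commit/rollback behaviour can be seen with Python's built-in sqlite3 module (an in-memory database is used here purely for illustration):

```python
import sqlite3

# In-memory database; sqlite3 opens an implicit transaction on writes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()  # COMMIT: the insert is now permanent

conn.execute("UPDATE accounts SET balance = 0 WHERE name = 'alice'")
conn.rollback()  # the update was never committed, so it is undone

balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'").fetchone()[0]
print(balance)  # 100: the committed insert survived, the rolled-back update did not
conn.close()
```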
I applied via Naukri.com and was interviewed in Jun 2021. There were 4 interview rounds.
Object-oriented programming (OOP) knowledge is an advantage but not necessary for a data engineer.
OOP concepts like inheritance, encapsulation, and polymorphism can be useful in designing data models.
OOP languages like Java and Python are commonly used in data engineering.
Understanding OOP can help with debugging and maintaining code.
However, OOP is not a requirement for data engineering, and other programming paradigms can be used effectively.
I applied via Job Portal
There was a test where you build a data pipeline.
Spark can be used for real-time data processing in streaming use cases.
Spark Streaming allows for processing real-time data streams.
It can handle high-throughput and fault-tolerant processing.
Examples include real-time analytics, monitoring, and alerting.
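Running Spark itself needs a cluster, but the micro-batch model behind Spark Streaming can be sketched with stdlib-only Python: the stream is cut into small batches, and each batch is processed with the same logic you would apply to a static dataset (all names here are illustrative, not Spark APIs):

```python
from collections import Counter
from itertools import islice

# Toy simulation of micro-batching: slice an event stream into
# fixed-size batches and update a running aggregate per batch.
def micro_batches(stream, batch_size):
    it = iter(stream)
    while batch := list(islice(it, batch_size)):
        yield batch

events = ["login", "click", "click", "error", "login", "error", "error"]
running_counts = Counter()
for batch in micro_batches(events, 3):
    running_counts.update(batch)  # per-batch update, as in a streaming aggregation
print(running_counts["error"])  # 3
```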
I appeared for an interview before Mar 2024.
The group discussion (GD) round is expected to last 20 minutes. The topics were straightforward and easily comprehensible. The primary focus when participating in the GD should be on English fluency. It is not primarily about how content-rich or intellectually impressive your speech is, but rather about the level of fluency in communication.
I applied via Naukri.com and was interviewed in Dec 2024. There were 2 interview rounds.
It was well designed
Surrogate key is a unique identifier used in databases to uniquely identify each record in a table.
Surrogate keys are typically generated by the system and have no business meaning.
They are used to simplify database operations and improve performance.
Example: Using an auto-incrementing integer column as a surrogate key in a table.
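The auto-incrementing example can be demonstrated with sqlite3, where an `INTEGER PRIMARY KEY` column is a system-assigned surrogate key (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- surrogate key, auto-generated
        email TEXT UNIQUE                 -- natural/business key
    )
""")
# No id supplied: the system assigns 1, 2, ... with no business meaning.
conn.execute("INSERT INTO customers (email) VALUES ('a@example.com')")
conn.execute("INSERT INTO customers (email) VALUES ('b@example.com')")
rows = conn.execute(
    "SELECT customer_id, email FROM customers ORDER BY customer_id").fetchall()
print(rows)  # [(1, 'a@example.com'), (2, 'b@example.com')]
conn.close()
```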
I applied via Company Website and was interviewed in Oct 2024. There were 4 interview rounds.
Basic Python, SQL, and Bash questions
Data pipeline design involves creating a system to efficiently collect, process, and analyze data.
Understand the data sources and requirements before designing the pipeline.
Use tools like Apache Kafka, Apache NiFi, or AWS Glue for data ingestion and processing.
Implement data validation and error handling mechanisms to ensure data quality.
Consider scalability and performance optimization while designing the pipeline.
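The ingest → validate → transform → load stages above can be sketched in stdlib-only Python; function and field names here are illustrative placeholders, not from any specific tool like Kafka or Glue:

```python
def ingest():
    # Stand-in for a real source such as a message queue or an API.
    return [
        {"id": 1, "amount": "10.5"},
        {"id": 2, "amount": "oops"},  # bad record: validation should catch it
        {"id": 3, "amount": "7"},
    ]

def validate(record):
    # Data-quality check: the amount must parse as a number.
    try:
        float(record["amount"])
        return True
    except (KeyError, ValueError):
        return False

def transform(record):
    return {"id": record["id"], "amount": float(record["amount"])}

def run_pipeline():
    good, rejected = [], []
    for record in ingest():
        (good if validate(record) else rejected).append(record)
    loaded = [transform(r) for r in good]  # "load" step, here just a list
    return loaded, rejected

loaded, rejected = run_pipeline()
print(len(loaded), len(rejected))  # 2 1
```

Routing bad records to a reject list rather than crashing is the error-handling point from the answer: the pipeline keeps running and the rejects can be inspected later.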
Big data refers to large and complex data sets that are difficult to process using traditional data processing applications.
It involves high volume, velocity, and variety of data
It includes data from various sources such as social media, sensors, and business transactions
Big data requires specialized tools and technologies for processing and analysis
Spark is a distributed computing framework that processes big data in memory and is known for its speed and ease of use.
It processes data in memory, which makes it much faster than disk-based processing.
It uses Resilient Distributed Datasets (RDDs) for fault-tolerant distributed data processing.
Spark provides high-level APIs in Java, Scala, Python, and R for ease of use.
It supports various data sources li...
Our application is a data engineering platform that processes and analyzes large volumes of data to provide valuable insights.
Our application uses various data processing techniques such as ETL (Extract, Transform, Load) to clean and transform raw data into usable formats.
We utilize big data technologies like Hadoop, Spark, and Kafka to handle large datasets efficiently.
The application also includes machine learning al...
Factorial coding questions and SQL coding questions using GROUP BY
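Minimal versions of both question types — an iterative factorial, and a GROUP BY aggregation shown via sqlite3 on a throwaway in-memory table (table and column names are made up for the example):

```python
import sqlite3

def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(factorial(5))  # 120

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 20), ("west", 5)])
# GROUP BY collapses rows sharing a region into one aggregated row.
totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))
print(totals)
conn.close()
```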
Amdocs is a software and services provider for communications, media, and entertainment industries.
Founded in 1982 in Israel
Headquartered in Chesterfield, Missouri
Provides customer experience solutions for telecom companies
Offers services such as billing, CRM, and data analytics