MathCo
Basic Python programs, pandas, SQL
Overfitting in decision trees occurs when the model learns noise in the training data rather than the underlying pattern.
Overfitting happens when the decision tree grows too complex and captures noise in the training data.
It leads to poor generalization on unseen data, because the model is too specific to the training set.
To prevent overfitting, techniques like pruning, setting a minimum number of samples per leaf, or using ensemble methods can be applied.
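The prevention techniques above can be sketched with scikit-learn; the synthetic dataset and parameter values here are illustrative, not from the interview.

```python
# A minimal sketch of limiting decision-tree complexity with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Unconstrained tree: tends to memorize the training set.
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Pruned tree: a minimum number of samples per leaf plus cost-complexity
# pruning (ccp_alpha) both limit how much noise the tree can fit.
pruned = DecisionTreeClassifier(min_samples_leaf=10, ccp_alpha=0.01,
                                random_state=0).fit(X_tr, y_tr)

print(deep.get_n_leaves(), pruned.get_n_leaves())
```

The unconstrained tree typically scores perfectly on its own training data, which is exactly the overfitting symptom described above.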
Bagging is a machine learning ensemble technique where multiple models are trained on different subsets of the training data and their predictions are combined.
Bagging stands for Bootstrap Aggregating.
It helps reduce overfitting by combining the predictions of multiple models.
Random Forest is a popular algorithm that uses bagging by training multiple decision trees on random subsets of the data.
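A minimal sketch of bootstrap aggregating with scikit-learn's `BaggingClassifier` (whose default base model is a decision tree); the synthetic dataset and `n_estimators` value are illustrative.

```python
# Each of the 25 base models is trained on a bootstrap sample of the data;
# their predictions are combined by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=42)

bag = BaggingClassifier(n_estimators=25, random_state=42)
bag.fit(X, y)

print(bag.score(X, y))
```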
A neuron is a basic unit of a neural network that receives input, processes it, and produces an output.
Neurons are inspired by biological neurons in the human brain.
They receive input signals, apply weights to them, sum them up, and pass the result through an activation function.
Neurons are organized in layers in a neural network, with each layer performing specific tasks.
In deep learning, multiple layers of neurons are stacked, letting each successive layer learn more abstract features of the input.
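The weighted-sum-plus-activation behavior described above can be sketched as a single function; the weights, bias, and choice of sigmoid activation are illustrative.

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term...
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...passed through a sigmoid activation, squashing the output to (0, 1).
    return 1 / (1 + math.exp(-z))

print(neuron([0.5, 0.3], [0.4, 0.7], bias=0.1))
```

With zero inputs and zero bias the sigmoid returns exactly 0.5, its midpoint.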
I applied via Naukri.com and was interviewed before Oct 2023. There were 2 interview rounds.
Python coding and SQL coding
List of strings starting with 'a'
Use a loop to iterate through each string
Check if each string starts with 'a'
Add the string to the list if it starts with 'a'
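The steps above can be sketched as follows; the sample word list is illustrative.

```python
# Build a list of the strings that start with 'a'.
words = ["apple", "banana", "avocado", "cherry", "apricot"]

starts_with_a = []
for w in words:
    if w.startswith("a"):
        starts_with_a.append(w)

print(starts_with_a)  # ['apple', 'avocado', 'apricot']
```

The same result in idiomatic Python is a one-line list comprehension: `[w for w in words if w.startswith("a")]`.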
Use SQL query to find max value from a table
Use SQL query SELECT MAX(column_name) FROM table_name;
For example, SELECT MAX(salary) FROM employees;
Ensure proper column name and table name are used in the query
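The `MAX` query can be checked end to end with an in-memory SQLite database; the table contents here are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Asha", 50000), ("Ravi", 72000), ("Meera", 61000)])

# SELECT MAX(column_name) FROM table_name, as described above.
(max_salary,) = conn.execute("SELECT MAX(salary) FROM employees").fetchone()
print(max_salary)  # 72000
```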
I applied via Campus Placement and was interviewed in Oct 2022. There were 4 interview rounds.
SQL, quantitative, and logical reasoning questions were asked.
I applied via Naukri.com and was interviewed in Jul 2021. There was 1 interview round.
Slowly changing data handling in Spark involves updating data over time.
Slowly changing dimensions (SCD) are used to track changes in data over time.
SCD Type 1 updates the data in place, overwriting the old values.
SCD Type 2 creates a new record for each change, with a start and end date.
SCD Type 3 adds a new column to the existing record to track changes.
Spark provides functions like `from_unixtime` and `unix_timestamp` to handle the timestamp columns these approaches depend on.
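The SCD Type 2 pattern described above (close the current record, append a new one with a fresh start date) can be sketched with pandas for a self-contained illustration; the interview context is Spark, but the logic is the same, and the table, column names, and `scd2_update` helper here are hypothetical.

```python
import pandas as pd

# A tiny dimension table: one current record per key.
dim = pd.DataFrame({
    "id": [1], "city": ["Pune"],
    "start_date": ["2021-01-01"], "end_date": [None], "current": [True],
})

def scd2_update(dim, key, new_value, change_date):
    # Close out the current record for this key by setting its end date...
    mask = (dim["id"] == key) & dim["current"]
    dim.loc[mask, ["end_date", "current"]] = [change_date, False]
    # ...and append a new current record carrying the changed attribute.
    new_row = pd.DataFrame({"id": [key], "city": [new_value],
                            "start_date": [change_date],
                            "end_date": [None], "current": [True]})
    return pd.concat([dim, new_row], ignore_index=True)

dim = scd2_update(dim, key=1, new_value="Mumbai", change_date="2021-07-01")
print(dim)
```

After the update, the history row is preserved with its end date set, and the new row is the only current record, which is exactly what SCD Type 2 requires.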
Explanation of cumulative sum and rank functions in Spark
Cumulative sum function calculates the running total of a column
Rank function assigns a rank to each row based on the order of values in a column
Both functions can be used with window functions in Spark
Example: df.withColumn('cumulative_sum', F.sum('column').over(Window.orderBy('order_column').rowsBetween(Window.unboundedPreceding, Window.currentRow)))
Example: df.withColumn('rank', F.rank().over(Window.orderBy('column'))), assuming `from pyspark.sql import functions as F, Window`
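The same two window operations can be checked with pandas for a self-contained run; the column names mirror the Spark example above and are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"order_column": [1, 2, 3, 4],
                   "column": [10, 30, 20, 40]})

# Running total of `column` in `order_column` order,
# analogous to F.sum(...).over(Window.orderBy(...)) in Spark.
df = df.sort_values("order_column")
df["cumulative_sum"] = df["column"].cumsum()

# Rank of each row by the value in `column` (1 = smallest),
# analogous to F.rank().over(Window.orderBy(...)).
df["rank"] = df["column"].rank(method="min").astype(int)

print(df)
```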
I applied via AngelList and was interviewed before Jun 2021. There were 2 interview rounds.
I applied via Naukri.com and was interviewed in Sep 2021. There were 4 interview rounds.
| Role | Salaries reported | Salary range |
| --- | --- | --- |
| Analyst | 229 | ₹4 L/yr - ₹11 L/yr |
| Senior Associate | 227 | ₹10 L/yr - ₹28 L/yr |
| Data Analyst | 188 | ₹3 L/yr - ₹9.7 L/yr |
| Associate | 142 | ₹6 L/yr - ₹17 L/yr |
| Data Scientist | 127 | ₹6 L/yr - ₹19 L/yr |
Fractal Analytics
Mu Sigma
LatentView Analytics
Tiger Analytics