For a dataset with 1000 samples and 700 dimensions, how would you find a line that best fits the data, so that you can extrapolate? This is not a supervised ML problem; there is no target. And how would you do it if you wanted to treat this as a supervised ML problem? How would you pick the column to use as the target? What are the potential problems with treating this as a supervised problem?
To find a line that best fits the data with 1000 samples and 700 dimensions, we can use linear regression.
For an unsupervised ML approach, we can use Principal Component Analysis (PCA) to reduce dimensio...
While we can treat this problem as a supervised one by creating a synthetic target variable, we should exercise caution and carefully consider the implications and limitations of doing so. Additionall...
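One limitation that is easy to demonstrate is overfitting: with 700 columns and only 1000 rows, regressing any one column on the other 699 leaves very few degrees of freedom. The sketch below is an illustration on assumed data, not part of the original answer. It uses pure-noise columns, an 800/200 split, and a hypothetical helper named fit_with_target, so there is nothing real to learn, yet the training fit still looks strong. A second issue, not shown here, is that ordinary least squares only minimizes error in the chosen target column, so a different choice of target gives a different line.

```python
import numpy as np

def r2(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_with_target(X_train, X_test, j):
    """Hypothetical helper: regress column j on all other columns (OLS)."""
    y_tr, y_te = X_train[:, j], X_test[:, j]
    A_tr = np.delete(X_train, j, axis=1)
    A_te = np.delete(X_test, j, axis=1)
    # Prepend an intercept column of ones.
    A_tr = np.column_stack([np.ones(len(A_tr)), A_tr])
    A_te = np.column_stack([np.ones(len(A_te)), A_te])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return r2(y_tr, A_tr @ coef), r2(y_te, A_te @ coef)

# Assumed data: independent noise, so no column truly depends on the others.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 700))
X_train, X_test = X[:800], X[800:]

train_r2, test_r2 = fit_with_target(X_train, X_test, j=0)
print(f"train R^2: {train_r2:.2f}, held-out R^2: {test_r2:.2f}")
```

On data like this, the training R^2 should come out high while the held-out R^2 is negative, which is why a held-out check or regularization would usually be needed before trusting such a fit for extrapolation.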
For the unsupervised approach, you could use Principal Component Analysis (PCA) to find the line that best fits the data. PCA is a technique that identifies the directions of maximum variance in the d...
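As a rough illustration of the PCA idea, the sketch below fits a line through the data mean along the first principal direction, which is the line minimizing the summed squared orthogonal distances to the points. The synthetic data and the helper names fit_line_pca and point_on_line are assumptions made for this example. Extrapolating then means choosing a position t along the line beyond the observed range.

```python
import numpy as np

def fit_line_pca(X):
    """Return the data mean, the first principal direction, and the
    fraction of total variance that direction explains."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    direction = Vt[0]
    explained = s[0] ** 2 / np.sum(s ** 2)
    return mean, direction, explained

def point_on_line(mean, direction, t):
    """Point at signed distance t from the mean along the fitted line."""
    return mean + t * direction

# Assumed data: 1000 samples scattered around one line in 700 dimensions.
rng = np.random.default_rng(0)
true_dir = rng.normal(size=700)
true_dir /= np.linalg.norm(true_dir)
t = rng.uniform(-10, 10, size=1000)
X = np.outer(t, true_dir) + 0.1 * rng.normal(size=(1000, 700))

mean, direction, explained = fit_line_pca(X)

# Position of each sample along the line (its first principal score).
scores = (X - mean) @ direction

# Extrapolate 50% past the largest observed position.
extrapolated = point_on_line(mean, direction, 1.5 * scores.max())
print(f"variance explained by the line: {explained:.2f}")
```

The explained-variance fraction is worth checking before extrapolating: if it is low, a single line is a poor summary of the data and more components, or a different model, may be needed.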
We would first perform EDA to identify the dependent variable among the 700 columns, discard the redundant columns, and clean the data. After completing these steps, we would train the model...
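The answer does not say how redundant columns would be identified. One common option, shown here only as an assumed sketch, is to drop any column whose absolute correlation with an already-kept column exceeds a threshold. The function name drop_redundant_columns and the 0.95 threshold are illustrative choices, not part of the original answer.

```python
import numpy as np

def drop_redundant_columns(X, threshold=0.95):
    """Return indices of columns to keep, greedily skipping any column
    that is highly correlated with a column already kept."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Small assumed example: column 2 is nearly a multiple of column 0.
rng = np.random.default_rng(0)
X_small = rng.normal(size=(100, 3))
X_small[:, 2] = 2.0 * X_small[:, 0] + 0.01 * rng.normal(size=100)

kept = drop_redundant_columns(X_small)
X_reduced = X_small[:, kept]
print(kept)
```

The greedy order matters: the earlier column of a correlated pair is the one kept. Pairwise correlation also misses columns that are redundant only as a combination of several others.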