Machine Learning/Data Science Interview Questions
1. "What is X?"
The basic concepts.
 Statistics questions, such as: what is an F-test?
 implement logistic regression training for binary classification
 explain overfitting, underfitting, bias, and variance, and how they relate
 gradient descent
 L1/L2 regularization
 Bayes' theorem
 collaborative filtering
 dimensionality reduction
 What is batch normalization? What benefit does it give?
 Explain naive Bayes. (What does "independent" mean here?) Explain how to use it to build a spam filter.
 Explain the ROC curve. What does the curve look like for a random guess? What is the difference between two points on the ROC curve? Explain the precision-recall curve. Explain the confusion matrix. Explain the F1 score; why do we use it?
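For the "implement logistic regression training" question, one possible answer is sketched below: batch gradient descent on the mean log loss, in plain Python. The function names, the toy data, and the hyperparameters (`lr`, `epochs`, `l2`) are illustrative assumptions, not part of the original question.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    # P(y = 1 | x) under the current weights and bias.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.1, epochs=500, l2=0.0):
    """Fit weights w and bias b by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # d(log loss)/d(logit)
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Optional L2 penalty (l2 / 2) * ||w||^2 adds l2 * w[j] to the gradient.
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


# Toy data (assumed for illustration): two Gaussian blobs in 2-D.
rng = random.Random(0)
X = [[rng.gauss(-2, 1), rng.gauss(-2, 1)] for _ in range(50)]
X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

w, b = train_logistic_regression(X, y)
preds = [1 if predict_proba(xi, w, b) >= 0.5 else 0 for xi in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
print("training accuracy:", accuracy)
```

In an interview it is usually worth stating why the gradient is simply (prediction minus label) times the input, and how the `l2` term ties into the L1/L2 regularization item above.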
2. "Compare X and Y" or "Pros and Cons" of X
One step beyond the basic concepts; these need a better understanding of the topics.
 bias-variance tradeoff
 bagging vs boosting
 Difference between convex and non-convex optimization problems
 why stochastic gradient descent is appropriate for distributed training
 how XGBoost differs from traditional GBDT, e.g. what is special about its loss function, and why it needs to compute the second-order derivative
 AdaBoost (adaptive boosting) vs. gradient boosting.
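For the XGBoost question, the role of the second-order derivative can be sketched with the second-order Taylor expansion of the objective at boosting round t. The notation below (g_i, h_i, the leaf index set I_j, the regularization constants gamma and lambda) follows the usual XGBoost presentation and is not defined elsewhere on this page.

```latex
\begin{align*}
\mathcal{L}^{(t)} &\approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
    + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \Big] + \Omega(f_t), \\
g_i &= \frac{\partial\, l\big(y_i, \hat{y}\big)}{\partial \hat{y}}\bigg|_{\hat{y} = \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}\big)}{\partial \hat{y}^2}\bigg|_{\hat{y} = \hat{y}_i^{(t-1)}}, \\
\Omega(f) &= \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2, \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda},
\qquad G_j = \sum_{i \in I_j} g_i, \quad H_j = \sum_{i \in I_j} h_i .
\end{align*}
```

Here T is the number of leaves and w_j the weight of leaf j. Traditional GBDT fits each tree to the first-order term (the negative gradient) only; the h_i terms are what give the closed-form leaf weight above.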
What are the problems with feature importance in Random Forest and Gradient Boosted Tree?
 Feature selection based on impurity reduction is biased towards preferring variables with more categories
 With correlated features, strong features can end up with low scores
Check out the Versus page
3. Practical Questions
Needs a deeper understanding of the topic or hands-on experience.
 How do you adjust the cost parameter for the SVM regularizer?
 How do you assess the quality of a clustering, especially to know when you have the right number of clusters?
 How do you pick the features to use?
 model: over calibration issue
4. Design Questions
"How would you approach ..."
Questions about a real-world problem:
 How would you approach the Netflix Prize?
 How would you generate related searches on Bing?
 How would you suggest followers on Twitter?
More Questions

Describe how a decision tree works from the viewpoint of "information gain". Why might pruning help? What benefit do we get from pruning a tree?

What is a random forest? How is bagging used to build one? Does a random forest need pruning, and why?

What's the difference between sigmoid and ReLU? What are their advantages and disadvantages? (sparsity, vanishing gradients, activation blow-up, complexity)

Which optimizer did you use in your DL model? Explain Adam, momentum, and SGD.

Explain transfer learning and fine-tuning. Can you arbitrarily take out one layer from a CNN model? Why? Can you run a CNN on images of different sizes? Why?

Explain learning rate decay; why use it? Explain L2 regularization; why use it? What's the relation/difference between weight decay and L2 regularization?
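
For the weight decay vs. L2 question, a minimal sketch of one way to frame the answer: with plain SGD, adding the gradient of an L2 penalty (lam / 2) * ||w||^2 gives the same update as shrinking the weights by a factor (1 - lr * lam) before the gradient step. The function names and numbers are illustrative assumptions. The two generally stop being equivalent for adaptive optimizers such as Adam, which is the motivation for decoupled weight decay.

```python
def sgd_step_l2(w, grad, lr, lam):
    # L2 regularization: the penalty's gradient (lam * w) is added to the loss gradient.
    return [wi - lr * (gi + lam * wi) for wi, gi in zip(w, grad)]


def sgd_step_weight_decay(w, grad, lr, lam):
    # Weight decay: shrink the weights directly, then take the plain gradient step.
    return [(1 - lr * lam) * wi - lr * gi for wi, gi in zip(w, grad)]


w = [1.0, -2.0]
grad = [0.5, 0.25]
step_l2 = sgd_step_l2(w, grad, lr=0.1, lam=0.01)
step_wd = sgd_step_weight_decay(w, grad, lr=0.1, lam=0.01)
print(step_l2, step_wd)  # the two updates agree up to floating-point rounding
```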

Explain K-fold cross-validation. How do you use it to train your model?
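
For the K-fold question, the index bookkeeping can be sketched as follows. This is a plain-Python illustration with a hypothetical helper name, `k_fold_indices`; in practice a library splitter would normally be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is in validation exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = idx[start:start + size]
        train_idx = idx[:start] + idx[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    # Train on train_idx, evaluate on val_idx, then average the k validation scores.
    print(len(train_idx), len(val_idx))
```

The averaged validation score is typically used to compare models or hyperparameters; the final model is then usually retrained on all the data with the chosen settings.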

Explain LR (linear regression), the OLS (ordinary least squares) model, and PCA. What's the difference/relation between them?

Does PCA keep the directions of largest or smallest variance when we use it to compress data? Explain why. Bonus question: explain linear discriminant analysis and how it differs from PCA.

If your data is corrupted by noise, how does the noise affect your model: does it overfit or underfit? Why?

How does the value of K affect a KNN model? Does a larger K overfit or underfit? Does a smaller K overfit or underfit?

What is the major problem with BPTT (backpropagation through time) in RNNs? How can the gradient vanish or explode?

Illustrate the basic ideas of collaborative filtering and matrix factorization.

Compare K-means with Gaussian mixture models. What are the relations and differences?

Why use mini-batches in training? Why not just use SGD, or use all the training data in one batch when updating the gradient? Why use momentum (taking the history of gradients into account) when we use SGD?

How would you sample uniformly from a continuous stream of data? (Or: randomly pick n elements from a given array of m elements.) Reservoir sampling.
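
A minimal sketch of reservoir sampling (the classic "Algorithm R" variant), assuming the stream is any Python iterable of unknown length. The function name and the `rng` parameter are illustrative choices.

```python
import random


def reservoir_sample(stream, n, rng=None):
    """Return n items chosen uniformly at random from an iterable of unknown length."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < n:
            # Fill the reservoir with the first n items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(iter(range(1000)), 5, random.Random(42))
print(sample)
```

Only the reservoir is kept in memory, so this uses O(n) space for any stream length. If the stream has fewer than n items, all of them are returned.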
Links
https://resources.workable.com/machinelearningengineerinterviewquestions
http://www.galvanize.com/learn/learntocode/commondatascienceinterviewquestions/
https://algorithmsdatascience.quora.com/SummaryofsomeInterviewquestions
https://www.quora.com/Whataresomecommonmachinelearninginterviewquestions