
Machine Learning - Model Evaluation

Confusion Matrix

Actual \ Predicted   Positive         Negative
Positive             True Positive    False Negative
Negative             False Positive   True Negative

Derivations

  • Precision
Precision = {TP \over ActionRecords} = {TP \over TP + FP}
  • True Positive Rate (TPR), Sensitivity, Recall, Hit Rate
TPR = Sensitivity = Recall = HitRate = {TP \over AllPos} = {TP \over TP + FN}
  • Specificity
Specificity = {TN \over AllNeg} = {TN \over TN + FP}
  • False Positive Rate (FPR)
FPR = 1 - Specificity = {FP \over AllNeg} = {FP \over TN + FP}
  • Action Rate
ActionRate = {ActionRecords \over AllRecords} = {TP + FP \over AllRecords}
  • F-measure / F1 Score
F_1 = 2 \cdot {Precision \cdot Recall \over Precision + Recall}
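The derivations above can be computed directly from the four confusion-matrix counts. A minimal sketch, assuming illustrative counts (TP, FP, TN, FN are made-up values, not from any real model):

```python
# Hypothetical confusion-matrix counts (illustrative values only).
TP, FP, TN, FN = 80, 20, 880, 20

precision = TP / (TP + FP)                     # TP / ActionRecords
recall = TP / (TP + FN)                        # TPR / Sensitivity / HitRate
specificity = TN / (TN + FP)                   # TN / AllNeg
fpr = FP / (TN + FP)                           # equals 1 - specificity
action_rate = (TP + FP) / (TP + FP + TN + FN)  # ActionRecords / AllRecords
f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, specificity, fpr, action_rate, f1)
```

Note that FPR and specificity always sum to 1, since FP + TN = AllNeg.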

Illustration

TPR = {TP \over TP + FN}
FPR = {FP \over TN + FP}
Precision = {TP \over TP + FP}
Recall = {TP \over TP + FN}

Curves

Receiver Operating Characteristic (ROC)

One point in ROC space is superior to another if it lies to the northwest of the other (higher TPR, lower FPR, or both).

  • x-Axis: FPR
  • y-Axis: TPR(CatchRate)
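An ROC curve is traced by sweeping a decision threshold over the model's scores and recording (FPR, TPR) at each cut. A minimal sketch, assuming made-up scores and labels:

```python
# Sketch: trace ROC points by sweeping a score threshold (illustrative data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

pos = sum(labels)
neg = len(labels) - pos
points = []
for t in sorted(set(scores), reverse=True):
    # Everything scored >= t is predicted positive at this threshold.
    tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
    points.append((fp / neg, tp / pos))  # (FPR, TPR)
print(points)
```

As the threshold drops, both FPR and TPR rise monotonically toward (1, 1).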

Precision-Recall (PR)

  • x-Axis: Recall(HitRate)
  • y-Axis: Precision
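A PR curve is built the same way as ROC, but recording (Recall, Precision) at each threshold. A minimal sketch, assuming the same kind of made-up scores and labels:

```python
# Sketch: trace precision-recall points by sweeping a threshold (illustrative data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

pos = sum(labels)
points = []
for t in sorted(set(scores), reverse=True):
    predicted = [y for s, y in zip(scores, labels) if s >= t]
    tp = sum(predicted)
    precision = tp / len(predicted)  # TP / ActionRecords
    recall = tp / pos                # TP / AllPos
    points.append((recall, precision))  # (x, y) = (Recall, Precision)
print(points)
```

Unlike TPR on the ROC curve, precision need not be monotonic as the threshold drops.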

Lift

  • x-Axis: ActionRate(% Total)
  • y-Axis: Lift

Random = (AllPositives / Total) * ActionRecords = (TP + FN) / (TP + FP + TN + FN) * (TP + FP)

UseModel = TP

Lift = UseModel / Random = TP / ((TP + FN) / (TP + FP + TN + FN) * (TP + FP))
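The lift formula above compares the hits from acting on the model's top predictions against the hits expected from acting on the same number of records at random. A minimal sketch, assuming the same illustrative counts as before:

```python
# Hypothetical confusion-matrix counts (illustrative values only).
TP, FP, TN, FN = 80, 20, 880, 20
total = TP + FP + TN + FN

# Expected hits if the (TP + FP) actioned records were chosen at random.
random_hits = (TP + FN) / total * (TP + FP)
# Actual hits when actioning the model's positive predictions.
model_hits = TP

lift = model_hits / random_hits
print(lift)
```

A lift above 1 means the model concentrates positives better than random selection at the same action rate.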

Gain

  • x-Axis: ActionRate(% Total)
  • y-Axis: HitRate(% Positive)
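A gain curve plots, for each action rate, the fraction of all positives captured when acting on the top-scored records. A minimal sketch, assuming made-up scores and labels:

```python
# Sketch: cumulative gain points -- rank by score, act on the top k records,
# and measure the fraction of all positives captured (illustrative data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

ranked = [y for _, y in sorted(zip(scores, labels), reverse=True)]
pos = sum(labels)
points = []
for k in range(1, len(ranked) + 1):
    action_rate = k / len(ranked)        # x-axis: % of total actioned
    hit_rate = sum(ranked[:k]) / pos     # y-axis: % of positives captured
    points.append((action_rate, hit_rate))
print(points)
```

The diagonal from (0, 0) to (1, 1) is the random baseline; a good model's gain curve rises well above it.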