Machine Learning - Model Evaluation

Updated: 2018-06-30

Confusion Matrix

Actual \ Predicted   Predicted Positive    Predicted Negative
Actual Positive      True Positive (TP)    False Negative (FN)
Actual Negative      False Positive (FP)   True Negative (TN)
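
A minimal Python sketch of how these four cells are counted, assuming binary labels encoded as 1 (positive) and 0 (negative); the lists y_true and y_pred are illustrative:

```python
# Count the four confusion-matrix cells for binary labels (1 = positive, 0 = negative).
def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Example: 6 actual positives, 4 actual negatives.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
print(confusion_counts(y_true, y_pred))  # (4, 1, 2, 3)
```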

Derivations

  • Precision
Precision = {TP \over ActionRecords} = {TP \over TP+FP}
  • True Positive Rate (TPR), Sensitivity, Recall, HitRate
TPR = Sensitivity = Recall = HitRate = {TP \over AllPos} = {TP \over TP+FN}
  • Specificity
Specificity = {TN \over AllNeg} = {TN \over TN+FP}
  • False Positive Rate(FPR)
FPR = 1 - Specificity = {FP \over AllNeg} = {FP \over TN+FP}
  • ActionRate
ActionRate = {ActionRecords \over AllRecords} = {TP+FP \over AllRecords}
  • F-measure / F1 Score
F_1 = 2 \cdot {Precision \cdot Recall \over Precision + Recall}
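
A minimal Python sketch of the derived metrics above, assuming the four counts from the confusion matrix (degenerate zero denominators are not handled):

```python
# Derived metrics from the confusion-matrix counts.
def metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    precision   = tp / (tp + fp)              # TP / ActionRecords
    recall      = tp / (tp + fn)              # TPR / Sensitivity / HitRate
    specificity = tn / (tn + fp)
    fpr         = fp / (tn + fp)              # 1 - Specificity
    action_rate = (tp + fp) / total           # ActionRecords / AllRecords
    f1          = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, fpr, action_rate, f1

# Using the counts from the earlier example: tp=4, fp=1, fn=2, tn=3.
print(metrics(4, 1, 2, 3))  # approx. (0.8, 0.667, 0.75, 0.25, 0.5, 0.727)
```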

Illustration

TPR = {TP \over TP+FN} \qquad FPR = {FP \over TN+FP} \qquad Precision = {TP \over TP+FP} \qquad Recall = {TP \over TP+FN}

Curves

Receiver Operating Characteristic (ROC)

One point in ROC space is superior to another if it lies to the northwest of the other (higher TPR and lower FPR).

  • x-Axis: FPR
  • y-Axis: TPR (CatchRate)
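
A minimal sketch of how the ROC points are produced: sweep a decision threshold over the model's scores and record one (FPR, TPR) pair per threshold. The score values here are placeholders.

```python
# One (FPR, TPR) point per score threshold, from strictest to loosest.
def roc_points(y_true, scores):
    points = []
    for thresh in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thresh else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
        points.append((fp / (fp + tn), tp / (tp + fn)))  # (x=FPR, y=TPR)
    return points

y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.3, 0.1]
print(roc_points(y_true, scores))
```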

Precision-Recall (PR)

  • x-Axis: Recall (HitRate)
  • y-Axis: Precision
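
The PR curve comes from the same threshold sweep, just with different coordinates; a sketch under the same assumptions as the ROC snippet:

```python
# One (Recall, Precision) point per score threshold.
def pr_points(y_true, scores):
    points = []
    for thresh in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thresh else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        points.append((tp / (tp + fn), tp / (tp + fp)))  # (x=Recall, y=Precision)
    return points
```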

Lift

  • x-Axis: ActionRate (% Total)
  • y-Axis: Lift

Random: (AllPositive / Total) * Action = (TP + FN) / (TP + FP + TN + FN) * (TP + FP)
UseModel: TP

Lift = UseModel / Random = TP / ((TP + FN) / (TP + FP + TN + FN) * (TP + FP))
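
A small worked version of this formula, reusing the example counts tp=4, fp=1, fn=2, tn=3 from above:

```python
# Lift: hits from acting on the model's flagged records (TP) versus the hits
# expected when flagging the same number of records (TP + FP) at random.
def lift(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    action = tp + fp                           # records acted on
    random_hits = (tp + fn) / total * action   # prevalence * records acted on
    return tp / random_hits

print(lift(4, 1, 2, 3))  # 4 / (6/10 * 5) = 1.333...
```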

Gain

  • x-Axis: ActionRate (% Total)
  • y-Axis: HitRate (% Positive)
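
A minimal sketch of the cumulative gain curve: rank records by descending score, then at each cutoff record the action rate against the fraction of all positives captured so far. Names and data are illustrative.

```python
# Cumulative gain: (ActionRate, HitRate) after acting on the top-k scored records.
def gain_points(y_true, scores):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    all_pos = sum(y_true)
    points, hits = [], 0
    for k, i in enumerate(order, start=1):
        hits += y_true[i]
        points.append((k / len(order), hits / all_pos))
    return points

y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.3, 0.1]
print(gain_points(y_true, scores))
```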