Weight of Evidence and Information Value
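Weight of evidence (WOE) of bin $i$:
$$WOE_i = \ln\left(\frac{\%pos_i}{\%neg_i}\right)$$
Information value (IV):
$$IV = \sum_{i=1}^{n} (\%pos_i - \%neg_i) \times WOE_i$$
where $\%pos_i = pos_i / \sum_{j=1}^{n} pos_j$, $\%neg_i = neg_i / \sum_{j=1}^{n} neg_j$, and $n$ is the number of bins.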
WOE and IV work for both continuous and categorical variables.
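CONTINUOUS/CATEGORICAL -> CATEGORICAL (discrete numeric values): the variable is split into bins, and each bin is mapped to a discrete numeric value (its WOE).
- CONTINUOUS: for each interval, calculate its share of all positives (%pos) and its share of all negatives (%neg)
- CATEGORICAL: for each category, calculate its share of all positives (%pos) and its share of all negatives (%neg)
Optionally, missing values can get a MISSING bin of their own.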
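The binning step above can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: the function names `assign_bin` and `bin_counts`, the left-closed interval convention, the use of `None` for missing values, and the label `1` for positives are all assumptions made for the example.

```python
from bisect import bisect_right

def assign_bin(value, edges):
    """Map a continuous value to an interval label; None goes to MISSING."""
    if value is None:
        return "MISSING"
    i = bisect_right(edges, value)  # index of the first edge greater than value
    lo = "-inf" if i == 0 else str(edges[i - 1])
    hi = "inf" if i == len(edges) else str(edges[i])
    return f"[{lo}, {hi})"

def bin_counts(values, labels, edges=None):
    """Count (pos, neg) per bin.

    edges=None treats each distinct value as its own category;
    otherwise values are cut into intervals at the sorted edges.
    """
    counts = {}
    for value, label in zip(values, labels):
        if edges is None:
            name = "MISSING" if value is None else str(value)
        else:
            name = assign_bin(value, edges)
        pos, neg = counts.get(name, (0, 0))
        counts[name] = (pos + 1, neg) if label == 1 else (pos, neg + 1)
    return counts

# Made-up values, for illustration only
ages = [25, 30, None, 61, 45]
target = [1, 0, 1, 0, 0]
age_counts = bin_counts(ages, target, edges=[30, 50])
color_counts = bin_counts(["red", "blue", None, "red"], [1, 0, 0, 0])
```

Dividing each bin's counts by the column totals then gives %pos and %neg.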
Putting everything together:
(The data are made up and serve only to illustrate the calculation.)
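- WOE of (e.g.) MISSING: $\ln(\%pos_{MISSING} / \%neg_{MISSING})$
- IV of (e.g.) MISSING: $(\%pos_{MISSING} - \%neg_{MISSING}) \times WOE_{MISSING}$
- Total IV: the sum of the IVs of all bins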
- if %pos > %neg, WOE is positive
- if %pos < %neg, WOE is negative
- if %pos = %neg, WOE is 0
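- IV is never negative: in each bin's term (%pos - %neg) * WOE the two factors always have the same sign, so the term is >= 0, and it is 0 only when %pos = %neg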
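The calculation and the sign rules above can be checked with a short sketch. The function name `woe_iv` and the counts in `example_counts` are hypothetical, chosen only for illustration. The sketch also assumes every bin contains at least one positive and one negative, since otherwise the ratio or its logarithm is undefined.

```python
import math

def woe_iv(counts):
    """Return ({bin: (woe, iv)}, total_iv) from {bin: (pos, neg)} counts.

    Assumption: every bin has at least one positive and one negative.
    """
    total_pos = sum(pos for pos, _ in counts.values())
    total_neg = sum(neg for _, neg in counts.values())
    per_bin = {}
    for name, (pos, neg) in counts.items():
        pct_pos = pos / total_pos  # bin's share of all positives
        pct_neg = neg / total_neg  # bin's share of all negatives
        woe = math.log(pct_pos / pct_neg)
        per_bin[name] = (woe, (pct_pos - pct_neg) * woe)
    total_iv = sum(iv for _, iv in per_bin.values())
    return per_bin, total_iv

# Hypothetical counts, made up for illustration: bin -> (pos, neg)
example_counts = {"MISSING": (10, 40), "[0, 50)": (30, 40), "[50, inf)": (60, 20)}
per_bin, total_iv = woe_iv(example_counts)
for name, (woe, iv) in per_bin.items():
    print(f"{name}: WOE={woe:+.4f} IV={iv:.4f}")
print(f"Total IV={total_iv:.4f}")
```

With these counts the MISSING bin holds 10% of the positives and 40% of the negatives, so its WOE is ln(0.25), which is negative, while its IV contribution is still positive.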