# Weight of Evidence and Information Value

Updated: 2019-01-13

## Calculation of Weight of Evidence(WOE)

Weight of evidence(WOE):

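$WOE_i=\ln({{\% pos_i} \over {\% neg_i}})$

where $i=1,2, ..., k$, $k$ is the number of bins, and $\% pos_i$ and $\% neg_i$ are the fractions of all positive and all negative cases that fall into bin $i$.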

## Calculation of Information Value(IV)

Information Value(IV):

$IV = \sum_{i=1}^k \{(\% pos_i - \% neg_i) \times WOE_i \}$

WOE and IV work for both continuous and categorical variables.

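In both cases the variable is binned first, so a continuous or categorical input becomes a categorical variable with one discrete numeric value per bin (CONTINUOUS/CATEGORICAL -> CATEGORICAL).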

## Calculation

### Step 1: Binning (out of the scope of this post)

• CONTINUOUS: calculate the relative frequencies of pos and neg by interval
• CATEGORICAL: calculate the relative frequencies of pos and neg by category

Optionally there could be a MISSING bin.
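
Once each row has been assigned a bin, the relative frequencies can be computed as in the minimal sketch below. The function name `bin_distributions`, the use of `None` to mark a missing value, and the 0/1 label encoding are assumptions made for illustration, not part of the original post.

```python
from collections import Counter

def bin_distributions(bins, labels):
    """Return {bin: (pct_pos, pct_neg)} from per-row bin names and 0/1 labels.

    Assumes at least one positive and one negative row overall;
    otherwise the division below raises ZeroDivisionError.
    """
    pos = Counter()
    neg = Counter()
    for b, y in zip(bins, labels):
        # Rows without a value go to the optional MISSING bin.
        key = "MISSING" if b is None else b
        if y == 1:
            pos[key] += 1
        else:
            neg[key] += 1
    total_pos = sum(pos.values())
    total_neg = sum(neg.values())
    # Each count is divided by the overall total of its own class,
    # so the %pos column and the %neg column each sum to 1.
    return {
        key: (pos[key] / total_pos, neg[key] / total_neg)
        for key in sorted(set(pos) | set(neg), key=str)
    }

dist = bin_distributions(["a", "a", "b", None, "b", "a"], [1, 0, 1, 1, 0, 0])
```

Note that a bin can end up with a share of exactly 0 for one class (the MISSING bin in this toy input has no negatives), in which case its WOE is undefined.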

### Step 2: Calculate WOE for each bin

$WOE_i = \ln({\%pos_i \over \%neg_i}) = \ln({pos_i / \sum_i pos_i \over neg_i / \sum_i neg_i})$

### Step 3: Calculate IV

$IV_i = (\%pos_i - \%neg_i) \times WOE_i$

### Step 4: Sum Up

$IV = \sum_{i=1}^k IV_i$

Putting everything together:

$IV = \sum_{i=1}^k \{(\%pos_i - \%neg_i)\ln({\%pos_i \over \%neg_i}) \}$
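
Steps 2 to 4 can be sketched in a few lines of Python. The function name `woe_iv` and its input format (two parallel lists of per-bin counts) are hypothetical choices for this example. Raising an error for an empty bin is likewise just one possible way to handle the case where the logarithm is undefined.

```python
import math

def woe_iv(pos_counts, neg_counts):
    """Return (woe_per_bin, iv_per_bin, total_iv) for parallel lists of bin counts."""
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    woe, iv = [], []
    for p, n in zip(pos_counts, neg_counts):
        if p == 0 or n == 0:
            # ln(0) and division by zero are undefined; how to treat
            # such bins (merge, smooth, drop) is a separate decision.
            raise ValueError("WOE is undefined for a bin with zero positives or negatives")
        pct_pos = p / total_pos
        pct_neg = n / total_neg
        w = math.log(pct_pos / pct_neg)      # Step 2: natural log of the ratio
        woe.append(w)
        iv.append((pct_pos - pct_neg) * w)   # Step 3: per-bin IV
    return woe, iv, sum(iv)                  # Step 4: sum over bins

woe, iv, total_iv = woe_iv([10, 15, 15, 20, 20, 20], [5, 5, 10, 20, 25, 35])
```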

## Example

(This data is made up and only for illustration of calculation)

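| bin | %pos | %neg | WOE | IV |
|---|---|---|---|---|
| MISSING | 0.1 | 0.05 | 0.693 | 0.035 |
| 1 | 0.15 | 0.05 | 1.099 | 0.110 |
| 2 | 0.15 | 0.1 | 0.405 | 0.020 |
| 3 | 0.2 | 0.2 | 0.0 | 0.0 |
| 4 | 0.2 | 0.25 | -0.223 | 0.011 |
| 5 | 0.2 | 0.35 | -0.560 | 0.084 |
| Sum | 1.0 | 1.0 | | 0.260 |
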
• WOE of (e.g.) MISSING: $WOE_{MISSING} = \ln(0.1/0.05) = 0.693$
• IV of (e.g.) MISSING: $IV_{MISSING} = (0.1-0.05) * 0.693 = 0.035$
• Total IV: $0.035 + 0.110 + 0.020 + 0.0 + 0.011 + 0.084 = 0.260$
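
The example figures can be reproduced with a short script. This is only a check of the arithmetic; the `rows` list simply restates the made-up %pos and %neg values from the example.

```python
import math

# (bin, %pos, %neg) values restated from the example
rows = [
    ("MISSING", 0.10, 0.05),
    ("1", 0.15, 0.05),
    ("2", 0.15, 0.10),
    ("3", 0.20, 0.20),
    ("4", 0.20, 0.25),
    ("5", 0.20, 0.35),
]

total_iv = 0.0
for name, pct_pos, pct_neg in rows:
    woe = math.log(pct_pos / pct_neg)   # natural logarithm
    iv = (pct_pos - pct_neg) * woe
    total_iv += iv
    print(f"{name:>7}  WOE={woe:6.3f}  IV={iv:5.3f}")

print(f"Total IV = {total_iv:.3f}")
```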

## Observations

• if %pos > %neg, WOE is positive
• if %pos < %neg, WOE is negative
• if %pos = %neg, WOE is 0
• IV is never negative: every bin contributes a non-negative term, and IV is 0 only when %pos = %neg in every bin
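
The sign of IV follows from the sign of WOE. As a short derivation, write $a = \%pos_i$ and $b = \%neg_i$ (shorthand introduced here, assuming both are strictly positive) and consider the three cases:

$a > b \Rightarrow a - b > 0 \text{ and } \ln({a \over b}) > 0 \Rightarrow (a - b)\ln({a \over b}) > 0$

$a < b \Rightarrow a - b < 0 \text{ and } \ln({a \over b}) < 0 \Rightarrow (a - b)\ln({a \over b}) > 0$

$a = b \Rightarrow (a - b)\ln({a \over b}) = 0$

Each $IV_i$ is therefore at least 0, so $IV = \sum_{i=1}^k IV_i \geq 0$, with equality only when $\%pos_i = \%neg_i$ in every bin.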