Data Science - Functions
Softmax
Softmax generalizes logistic regression to classification problems where the class label y can take on more than two possible values.
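A minimal NumPy sketch of how this works (the function name softmax is my own): exponentiate each score and normalize so the outputs form a probability distribution over the classes. Subtracting the max before exponentiating is a standard numerical-stability trick and does not change the result.

import numpy as np

def softmax(z):
    # Shift by the max for numerical stability (the result is unchanged)
    e = np.exp(z - np.max(z))
    return e / e.sum()

# softmax(np.array([1.0, 2.0, 3.0])) -> array of probabilities summing to 1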
Sigmoid
Sigmoid: an "S"-shaped curve
The logistic sigmoid is a "squashing" function: it maps the whole real axis into a finite interval. Its inverse is the logit function (the log-odds).
- logistic: https://en.wikipedia.org/wiki/Logistic_function
- sigmoid: https://en.wikipedia.org/wiki/Sigmoid_function
import numpy as np

def sigmoid(z):
    # Squashes any real z into (0, 1)
    return 1.0 / (1 + np.exp(-z))
Sigmoid vs Logistic
The logistic function is one kind of sigmoid; "sigmoid" names the whole family of S-shaped curves.
Sigmoid vs Tanh
- tanh: y in (-1, 1)
- sigmoid: y in (0, 1)
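A quick numerical check (a sketch, reusing the sigmoid defined above) of both output ranges, and of the fact that tanh is just a stretched and shifted sigmoid: tanh(z) = 2 * sigmoid(2z) - 1.

import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

z = np.linspace(-5, 5, 101)
assert np.all((sigmoid(z) > 0) & (sigmoid(z) < 1))    # sigmoid output in (0, 1)
assert np.all((np.tanh(z) > -1) & (np.tanh(z) < 1))   # tanh output in (-1, 1)
assert np.allclose(np.tanh(z), 2 * sigmoid(2 * z) - 1)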
Rectifier
The rectifier is also known as a ramp function and is analogous to half-wave rectification in electrical engineering. A unit employing the rectifier is also called a rectified linear unit (ReLU). A smooth approximation to the rectifier is the analytic softplus function, softplus(x) = ln(1 + e^x); the derivative of softplus is the logistic function.
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
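A short NumPy sketch (the function names relu and softplus are my own) of the rectifier and its smooth softplus approximation, with the derivative relation noted above as a comment:

import numpy as np

def relu(z):
    # Ramp function: max(0, z)
    return np.maximum(0.0, z)

def softplus(z):
    # Smooth approximation to the rectifier: ln(1 + e^z)
    return np.log1p(np.exp(z))

# The derivative of softplus is the logistic sigmoid:
# d/dz ln(1 + e^z) = 1 / (1 + e^(-z))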
tanh
wiki: https://en.wikipedia.org/wiki/Hyperbolic_function
The tanh activation function is just a rescaled and shifted sigmoid: tanh(z) = 2 * sigmoid(2z) - 1.
There are two reasons for preferring tanh (assuming you have normalized your data, which is very important):
- Stronger gradients: since the data is centered around 0, the derivatives are larger. To see this, compute the derivative of tanh and note that its values lie in (0, 1], whereas the sigmoid's derivative never exceeds 0.25.
- Avoiding bias in the gradients. This is explained very well in the paper, which is worth reading to understand these issues.
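A small sketch (reusing the sigmoid above) of the "stronger gradients" point: tanh'(z) = 1 - tanh(z)^2 peaks at 1 at z = 0, while sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) never exceeds 0.25, so gradients through tanh are up to four times larger near zero-centered inputs.

import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

def d_tanh(z):
    # tanh'(z) = 1 - tanh(z)^2, maximum value 1 at z = 0
    return 1 - np.tanh(z) ** 2

def d_sigmoid(z):
    # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)), maximum value 0.25 at z = 0
    s = sigmoid(z)
    return s * (1 - s)

print(d_tanh(0.0), d_sigmoid(0.0))  # 1.0 0.25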