Jeff Dean talks about the Search Ranking team at Google being hesitant to use a neural net for search ranking. With search ranking you want to be able to understand the model and why it makes certain decisions; when the system makes a mistake, the team wants to understand why it did what it did.
Feedforward neural networks
- no cycles or loops, unlike recurrent neural networks
- the first and simplest type of artificial neural network
- The simplest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes
- A multi-layer neural network can compute a continuous output instead of a step function
- a single-layer network with a logistic activation function is identical to the logistic regression model
- logistic function = sigmoid function
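The equivalence above can be sketched in a few lines of numpy; the weights, bias, and input below are made-up values for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters for a 3-input, 1-output single-layer network.
w = np.array([0.5, -0.25, 0.1])
b = 0.2

def single_layer_forward(x):
    # One layer of output nodes with a logistic activation:
    # exactly the logistic regression prediction P(y=1 | x).
    return sigmoid(np.dot(w, x) + b)

x = np.array([1.0, 2.0, 3.0])
p = single_layer_forward(x)  # a probability in (0, 1)
```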
Backpropagation requires the derivative of the loss function with respect to the network output to be known, which typically (but not necessarily) means that a desired target value is known. For this reason it is considered to be a supervised learning method, although it is also used in some unsupervised networks such as autoencoders.
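A minimal sketch of that chain rule for a single sigmoid unit with a squared-error loss (the input, target, and weight are toy values, not from the notes), checked against a numerical gradient:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy supervised example: one sigmoid unit, squared-error loss.
x, target = 1.5, 1.0   # the desired target value must be known
w = 0.4                # hypothetical starting weight

def loss(w):
    y = sigmoid(w * x)
    return 0.5 * (y - target) ** 2

# Backprop is the chain rule: dL/dw = dL/dy * dy/dz * dz/dw
y = sigmoid(w * x)
dL_dy = y - target          # this is where the target is required
dy_dz = y * (1.0 - y)       # derivative of the sigmoid
dz_dw = x
grad = dL_dy * dy_dz * dz_dw

# Sanity check with a central-difference numerical gradient.
eps = 1e-6
num_grad = (loss(w + eps) - loss(w - eps)) / (2 * eps)
```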
For a feedforward neural network, the depth of the credit assignment paths (CAPs) is that of the network: the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited.
Deep learning involves CAP depth > 2 (more than 1 hidden layer)
A network with CAP depth 2 has been shown to be a universal approximator, in the sense that it can emulate any function.
Beyond that, extra layers do not add to the network's function-approximation ability; they help in learning features.
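As a concrete illustration of why depth 2 matters, a network with one hidden layer (CAP depth 2) can compute XOR, which no single-layer network can represent. The weights below are hand-constructed for illustration, not learned:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Hand-picked weights: one hidden layer of two ReLU units
# followed by a linear output is enough to compute XOR.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])

def xor_net(x):
    h = relu(W1 @ x + b1)   # hidden layer (depth-2 CAP)
    return w2 @ h           # output layer

xor_table = {}
for a, c in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    xor_table[(a, c)] = xor_net(np.array([a, c], dtype=float))
```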
Deep Learning models:
- MLPs with 5-10 layers
- ResNet-50-like CNNs
Convolutional neural networks (CNN)
- a type of feed-forward artificial neural network; variations of multilayer perceptrons designed to use minimal amounts of preprocessing.
- take a fixed size input and generate fixed-size outputs.
- designed to recognize images: the convolutions inside detect features such as the edges of an object in the image.
- e.g. Sentiment analysis: given some review, predict the rating of the review.
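A minimal sketch of the convolution idea behind that edge detection, using a hand-written valid-mode 2D cross-correlation (the operation deep-learning libraries call "convolution") and a made-up vertical-edge kernel:

```python
import numpy as np

def conv2d(image, kernel):
    # Valid-mode 2D cross-correlation: slide the kernel over the
    # image and take a dot product at every position.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny image with a vertical edge between columns 1 and 2.
image = np.array([[0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.]])
edge_kernel = np.array([[-1., 1.]])   # responds to left-to-right increases
response = conv2d(image, edge_kernel)  # fires only at the edge column
```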
Recurrent neural networks (RNN)
- not feedforward neural networks
- can use their internal memory to process input sequences of arbitrary length, but typically require much more data than conv-nets because they are a more complex model.
- designed to recognize sequences, for example a speech signal or a text; the cycles inside imply the presence of short-term memory in the net.
- e.g. Machine translation: Translate a sentence from some source language to target language.
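A minimal numpy sketch of a vanilla RNN cell (the parameter shapes and random inputs are made up): the same weights are reused at every time step, and the hidden state `h` is the internal memory that lets one cell handle sequences of any length:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inputs = 4, 3

# Hypothetical parameters, shared across every time step.
W_x = rng.normal(scale=0.1, size=(hidden, inputs))
W_h = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros(hidden)

def rnn_forward(sequence):
    # h carries information from one step to the next:
    # the "cycle" that gives the net its short-term memory.
    h = np.zeros(hidden)
    for x in sequence:
        h = np.tanh(W_x @ x + W_h @ h + b)
    return h

# The same cell works on sequences of different lengths.
h_short = rnn_forward(rng.normal(size=(2, inputs)))
h_long = rnn_forward(rng.normal(size=(50, inputs)))
```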
Recursive Neural Network
- more like a hierarchical network: there is no time aspect to the input sequence, but the input has to be processed hierarchically in a tree fashion.
A convolutional network is basically a standard neural network that's been extended across space using shared weights. A recurrent neural network is basically a standard neural network that's been extended across time by having edges which feed into the next time step instead of into the next layer in the same time step.
“There are some empirically-derived rules of thumb; of these, the most commonly relied on is ‘the optimal size of the hidden layer is usually between the size of the input and the size of the output layers.’” — Jeff Heaton (Encog author)
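Applying that rule of thumb is trivial; with made-up layer sizes for illustration:

```python
# Hypothetical layer sizes for illustration.
n_in, n_out = 20, 4

# Heaton's rule of thumb: pick a hidden size between the output
# and input sizes, e.g. the midpoint.
n_hidden = (n_in + n_out) // 2  # lies in [n_out, n_in]
```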