Frameworks, Tools and Libs
A typical Data Scientist/ML Engineer's toolbox.
- scikit-learn
- [Google] TensorFlow
- [Facebook] Caffe2
- [Facebook] PyTorch
- [Microsoft] CNTK: Cognitive Toolkit
- Apache MXNet http://gluon.mxnet.io/
- [Deprecated] Theano, Lasagne
- Pregel: http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html
- GraphLab: https://turi.com/
- Neo4j: http://www.neo4j.org/
- PyTorch: a replacement for NumPy that can use the power of GPUs
- PyTorch Tensors are similar to NumPy’s ndarrays, except that Tensors can also be placed on a GPU to accelerate computing.
A tensor is an n-dimensional array that stores data of a single type. For example, [[1, 2], [3, 4]] is a 2x2 tensor of integers (similarly, ["a", "b", "c"] is a tensor of strings). A blob is an entity that stores data of an arbitrary type, for example a tensor.
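The tensor definition above (a rectangular n-dimensional array whose elements share one type) can be checked with a small pure-Python sketch. This is only an illustration on nested lists, not how any framework implements tensors; the helper names `tensor_shape`, `element_types`, and `is_tensor` are made up for this example.

```python
from typing import Any, Set, Tuple


def tensor_shape(data: Any) -> Tuple[int, ...]:
    """Return the shape of a nested list; raise ValueError if it is ragged."""
    if not isinstance(data, list):
        return ()  # a scalar leaf has an empty shape
    if not data:
        return (0,)
    # Every item at this level must have the same shape.
    inner_shapes = {tensor_shape(item) for item in data}
    if len(inner_shapes) != 1:
        raise ValueError("ragged nesting: not a valid tensor")
    return (len(data),) + inner_shapes.pop()


def element_types(data: Any) -> Set[type]:
    """Collect the Python types of all leaf elements."""
    if not isinstance(data, list):
        return {type(data)}
    types: Set[type] = set()
    for item in data:
        types |= element_types(item)
    return types


def is_tensor(data: Any) -> bool:
    """True if data is rectangular and all leaves share a single type."""
    try:
        tensor_shape(data)
    except ValueError:
        return False
    return len(element_types(data)) <= 1


print(tensor_shape([[1, 2], [3, 4]]))  # (2, 2)
print(tensor_shape(["a", "b", "c"]))   # (3,)
print(is_tensor([1, "a"]))             # False: mixed element types
```

Real tensor libraries store the data in one contiguous typed buffer instead of nested lists, which is what makes GPU transfer and vectorized math practical.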
Models may be developed in one language (e.g. Python, R) while the production environment uses another (e.g. Java). One way to bridge the gap is to encode the models in a language- and tool-neutral format:
- ONNX: A collaboration between Facebook and Microsoft. Supports Caffe2, PyTorch, and Cognitive Toolkit.
- PMML (Predictive Model Markup Language): XML-based
- PFA (Portable Format for Analytics): expressed in YAML or JSON
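To make the idea of a language-neutral encoding concrete, here is a minimal sketch that writes a linear model to JSON and scores it from the JSON alone. The schema (`model_type`, `weights`, `bias`) and both function names are invented for this example; it is not PMML, PFA, or ONNX, each of which defines a far richer specification.

```python
import json
from typing import List


def export_linear_model(weights: List[float], bias: float) -> str:
    """Serialize a linear model to a JSON string (hypothetical schema)."""
    spec = {
        "model_type": "linear_regression",  # made-up field names
        "weights": weights,
        "bias": bias,
    }
    return json.dumps(spec, sort_keys=True)


def load_and_predict(encoded: str, features: List[float]) -> float:
    """Score one feature vector using only the JSON description."""
    spec = json.loads(encoded)
    if spec["model_type"] != "linear_regression":
        raise ValueError("unsupported model type: " + str(spec["model_type"]))
    if len(features) != len(spec["weights"]):
        raise ValueError("feature count does not match weight count")
    return sum(w * x for w, x in zip(spec["weights"], features)) + spec["bias"]


encoded = export_linear_model([2.0, -1.0], 0.5)
print(encoded)
print(load_and_predict(encoded, [3.0, 4.0]))  # 2.5
```

Because the consumer only needs a JSON parser and basic arithmetic, the scoring side could equally be written in Java or any other production language.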
Open Neural Network Exchange (ONNX): backed by AWS, Microsoft, and Facebook.
ONNX is intended to be a standardized format that will allow deep learning models trained on one framework to be transferred to another framework with minimal extra work.