**PyTorch**

“Optimized tensor library for deep learning using GPUs and CPUs”

- Based on Torch, developed by Facebook; exposes a Python API
- BoTorch library does Bayesian optimization (probabilistic models, optimizers, support for GPyTorch for deep kernel learning, multi-task GPs, and approximate inference)
- PyTorch defines a dynamic computational graph (can quickly and easily modify models)
- Takes advantage of Python’s native performance optimization
- Greg G: “my understanding is that PyTorch and TF do essentially the same thing well: automatic differentiation through arbitrary functions. I greatly prefer PyTorch because the API is much easier to use. That said, TF does have a large community and things like TF-probability, which supports probabilistic modeling. Edward is a different beast; it supports probabilistic modeling, so doing stuff like VI or HMC after you specify your model. Stan is similar, but I think it has more support. Anything that is gradient-based will be easy because auto diff. If you want more complex stuff, Uber AI’s Pyro is a probability framework built on PyTorch (it’s the equivalent of TF Probability).”
- From Archit: compared to TensorFlow, PyTorch has a bit more control and flexibility in how you do inference
- From Greg D: “My favorite by a good amount is PyTorch. TensorFlow is in second and Edward is a distant third. I’ll start with the last. Edward is no longer supported and never had much active support or user community. Debugging is incredibly hard with cryptic error messages. Besides a few example models, it’s very difficult to implement custom models unless you’re an absolute Edward expert. Even then it requires extending the language. PyTorch employs a dynamic computation graph, which means the computations that are executed can be determined at runtime, i.e., if the model itself is changing based on the inputs it’s easy to do. That also means it’s much easier to debug than TensorFlow because you can put in print statements everywhere and you can debug using the Python debugger pdb. TensorFlow uses a static computation graph, which means the computations are effectively “compiled” before running a program. It makes it much more difficult to debug but the pro is that it’s more efficient and faster (in general, than PyTorch). Also, Caffe2 and PyTorch have now been integrated into the same tool, which is a plus for PyTorch. PyTorch and TensorFlow both have pretty active communities. And both have lots of models freely available on GitHub. I’ve used PyTorch for anything from probabilistic models (like LMMs) to ML models like DNNs and generally anything that could benefit from autodiff. It’s also very easy to retrieve the gradients that PyTorch implicitly computes. You can use it to compute Jacobians directly too. My rule of thumb is that whatever the community you’re working in uses the most is the thing to choose because the support will be the best and people will have run into the same problems as you and found solutions (e.g., Stack Overflow). Let me know if I could answer any more particular questions. Happy to help with PyTorch if I can!”
- This article explains PyTorch’s autograd (automatic differentiation) in a very clear, accessible way: https://towardsdatascience.com/pytorch-autograd-understanding-the-heart-of-pytorchs-magic-2686cd94ec95
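
As a minimal sketch of the points above about dynamic graphs and easy access to gradients: the snippet below assumes a reasonably recent PyTorch (1.5 or later, for `torch.autograd.functional.jacobian`). The function names `branchy_loss` and `f` and all numeric values are made up for illustration and do not come from anyone's actual model.

```python
import torch


def branchy_loss(x, w):
    # Data-dependent control flow: the graph is recorded as the code runs,
    # so an ordinary Python `if` decides which operations are tracked.
    if x.sum() > 0:
        return (w * x).sum()
    return (w ** 2 * x).sum()


x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)

loss = branchy_loss(x, w)
loss.backward()           # reverse-mode autodiff fills in w.grad
grad_w = w.grad.clone()   # d(loss)/dw; on this branch it equals x


def f(v):
    # Vector-valued toy function used to demonstrate a full Jacobian.
    return torch.stack([v[0] * v[1], v[1] ** 2])


jac = torch.autograd.functional.jacobian(f, torch.tensor([2.0, 3.0]))

print(loss.item(), grad_w, jac)
```

Because the graph is rebuilt on every call, a `print` statement or a `pdb` breakpoint placed inside `branchy_loss` sees ordinary tensor values, which is the debugging advantage described in the quotes above.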

**TensorFlow**

“Provides a collection of workflows to develop and train models using Python, JavaScript, or Swift, and easily deploy in the cloud, on-prem, in the browser, or on-device.” It also has GPU support.

- Developed by Google. Like Theano, it expresses models as computation graphs, but it is a separate codebase rather than being built on Theano
- Compared with PyTorch, the user has to encode distributed computation manually
- TensorBoard, the visualization library used for debugging and training, is far superior to PyTorch’s Visdom
- “Eager execution” evaluates operations immediately: all functionality of the host language is available while the model is executing, which allows natural control flow and simpler debugging
- TensorFlow Probability: “Python library built on TensorFlow that makes it easy to combine probabilistic models and deep learning on modern hardware (TPU, GPU).” Edward2 has been incorporated into this to allow deep probabilistic models, VI, and MCMC.
- Archit uses TensorFlow for spatial matrix factorization, among other things. He says it is harder for inference to go wrong in TensorFlow because you have to define the entire computation graph of the model before running it (TensorFlow uses a static computational graph, although it has a way of implementing a dynamic one using another library)
- Andy used TensorFlow to compute gradients to fit a genomics model and a computer vision deep learning model.
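
To make the eager-execution and gradient points above concrete, here is a small sketch. It assumes TensorFlow 2.x, where eager execution is on by default and `tf.function` opts back into a compiled graph. The variable names, the function `compiled_loss`, and the values are illustrative only.

```python
import tensorflow as tf

w = tf.Variable([0.5, -1.0, 2.0])
x = tf.constant([1.0, 2.0, 3.0])

# Eager mode: operations run immediately, so plain Python control flow works.
with tf.GradientTape() as tape:
    y = tf.reduce_sum(w * x)
    if y > 0:
        loss = y ** 2
    else:
        loss = -y

grad = tape.gradient(loss, w)  # d(loss)/dw = 2 * y * x on this branch


# Graph mode: tf.function traces the Python function into a static graph.
@tf.function
def compiled_loss(w, x):
    return tf.reduce_sum(w * x) ** 2


graph_value = compiled_loss(w, x)

print(float(loss), grad.numpy(), float(graph_value))
```

The eager half behaves much like PyTorch, while the `tf.function` half corresponds to the define-the-graph-first style described in the notes above.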

**Edward**

“A library for probabilistic modeling, inference, and criticism”

- Built on TensorFlow
- Supports modeling with directed graphical models, implicit generative models, NNs, Bayesian nonparametrics, and probabilistic programs.
- Supports inference with VI, MC, EM, ABC, and message passing
- Posterior predictive checks and point-based evaluations
- Archit and Allison use Edward. Archit says Edward is great to set up VI without having to write out KL divergence or reconstruction error yourself.
- Andy used Edward to fit a probabilistic model without having to write out all the variational updates
- Others in the lab have stated that Edward is hard to use, but now that it’s integrated into TensorFlow Probability, this may be resolved
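
For context on what "not writing out the KL divergence or reconstruction error yourself" saves you: the objective that VI tools typically construct is the evidence lower bound (ELBO). The standard textbook form is sketched below, assuming data $x$, latent variables $z$, a prior $p(z)$, and a variational family $q_\lambda(z)$ with parameters $\lambda$. This notation is introduced here for illustration and is not taken from Edward's documentation.

$$
\log p(x) \;\ge\; \underbrace{\mathbb{E}_{q_\lambda(z)}\big[\log p(x \mid z)\big]}_{\text{expected log-likelihood (negative reconstruction error)}} \;-\; \underbrace{\mathrm{KL}\big(q_\lambda(z)\,\|\,p(z)\big)}_{\text{KL divergence to the prior}}
$$

VI maximizes the right-hand side over $\lambda$. A library like Edward builds both terms from the model and variational specification, so only the model and $q_\lambda$ need to be written by hand.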

**Keras**

“High-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano”

- Good for standard CNNs, RNNs, function approximation, etc.
- Hard to adapt if you need to build a custom architecture
- Niranjani has used Keras but states that she would switch to PyTorch in the future
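
As a rough illustration of why Keras is convenient for standard architectures, here is a toy CNN. It assumes the `tf.keras` API bundled with TensorFlow 2.x. The layer sizes and the 28x28x1 input shape are arbitrary choices for the example, not a recommended model.

```python
import tensorflow as tf

# A small, standard CNN declared layer by layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Loss and optimizer are chosen by name; training would then be model.fit(...).
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

print(model.output_shape, model.count_params())
```

Anything that fits this stack-of-layers pattern is quick to write. Architectures that do not fit it need subclassing or custom layers, and that is where the difficulty noted above comes in.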

Greg G also sent a link to a Reddit post: https://old.reddit.com/r/MachineLearning/comments/emrzmb/r_d_tensorflow_vs_pytorch_for_research/