Top Python Libraries for Deep Learning, Natural Language Processing & Computer Vision
This article originally appeared on KDnuggets.com. For more, visit https://www.kdnuggets.com/
This article compiles the 30 top Python libraries for deep learning, natural language processing & computer vision, as best determined by KDnuggets staff.
In a previous post, we looked at the top Python libraries for data science, data visualization, and machine learning. This time, we look at the top libraries for deep learning, natural language processing, and computer vision. These categories need no further clarification.
This separation and classification is arbitrary, in some instances more than others, but we have done our best to group tools together by intended use case, hoping this is most useful for readers.
Clearly, not all NLP and CV work these days is performed using deep learning techniques, but as the trends move toward such techniques for state-of-the-art results, we stand by this otherwise arbitrary categorization logic.
Our list is made up of libraries that our team decided by consensus were representative of common and well-used Python libraries. Also, to be included, a library must have a GitHub repository. The categories are in no particular order, and neither are the libraries within each. We contemplated ordering them by stars or some other metric, but decided against it so as not to explicitly assign any perceived value or importance to the libraries. Their listing here, then, is purely random. Library descriptions are taken directly from the GitHub repositories, in some form or another.
Thanks again to Ahmed Anis for contributing to the collection of this data, and to the rest of the KDnuggets staff for their inputs, insights, and suggestions.
Note that the visualization below, by Gregory Piatetsky, represents each library by type, plots it by stars and contributors, and sizes its symbol by the number of commits the library has on GitHub, on a logarithmic scale.
And so, without further ado, here are the 30 top Python libraries for deep learning, natural language processing & computer vision, as best determined by KDnuggets staff.
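For readers curious how such a chart might be put together, here is a minimal sketch using only the Python standard library. It takes the stars, contributors, and commits of three libraries from the list in this article and derives a symbol size from the logarithm of the commit count. The `marker_size` function and its `scale` parameter are hypothetical; the exact scaling used in the original visualization is not stated, so treat this as one plausible reading rather than the method behind the chart.

```python
import math

# (stars, contributors, commits) copied from three entries in this article
libraries = {
    "TensorFlow": (149000, 2754, 97741),
    "spaCy": (17400, 482, 11628),
    "Mahotas": (644, 25, 1273),
}


def marker_size(commits, scale=20.0):
    """Hypothetical sizing rule: symbol size grows with log10 of commits."""
    return scale * math.log10(commits)


# One point per library: x = stars, y = contributors, size from commits
points = [
    {"name": name, "x": stars, "y": contributors, "size": marker_size(commits)}
    for name, (stars, contributors, commits) in libraries.items()
]

for p in points:
    print(f"{p['name']:<12} stars={p['x']:>7} "
          f"contributors={p['y']:>5} size={p['size']:.1f}")
```

The logarithm keeps the symbols comparable: TensorFlow has roughly 77 times as many commits as Mahotas, yet under this rule its symbol is less than twice as large.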
Deep Learning
1. TensorFlow
Stars: 149000, Commits: 97741, Contributors: 2754
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.
2. Keras
Stars: 50000, Commits: 5349, Contributors: 864
Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow.
3. PyTorch
Stars: 43200, Commits: 30696, Contributors: 1619
Tensors and Dynamic neural networks in Python with strong GPU acceleration
4. fastai
Stars: 19800, Commits: 1450, Contributors: 607
fastai simplifies training fast and accurate neural nets using modern best practices
5. PyTorch Lightning
Stars: 9600, Commits: 3594, Contributors: 317
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
6. JAX
Stars: 10000, Commits: 5708, Contributors: 221
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
7. MXNet
Stars: 19100, Commits: 11387, Contributors: 839
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
8. Ignite
Stars: 3100, Commits: 747, Contributors: 112
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
Natural Language Processing
9. FastText
Stars: 21700, Commits: 379, Contributors: 47
fastText is a library for efficient learning of word representations and sentence classification.
10. spaCy
Stars: 17400, Commits: 11628, Contributors: 482
Industrial-strength Natural Language Processing (NLP) with Python and Cython
11. gensim
Stars: 11200, Commits: 4024, Contributors: 361
Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.
12. NLTK
Stars: 9300, Commits: 13990, Contributors: 319
NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing.
13. Datasets (Huggingface)
Stars: 4300, Commits: 568, Contributors: 64
Fast, efficient, open-access datasets and evaluation metrics for Natural Language Processing and more in PyTorch, TensorFlow, NumPy and Pandas
14. Tokenizers (Huggingface)
Stars: 3800, Commits: 1252, Contributors: 30
Fast State-of-the-Art Tokenizers optimized for Research and Production
15. Transformers (Huggingface)
Stars: 3500, Commits: 5480, Contributors: 585
Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
16. Stanza
Stars: 4800, Commits: 1514, Contributors: 19
Official Stanford NLP Python Library for Many Human Languages
17. TextBlob
Stars: 7300, Commits: 542, Contributors: 24
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
18. PyTorch-NLP
Stars: 1800, Commits: 442, Contributors: 15
Basic Utilities for PyTorch Natural Language Processing (NLP)
19. Textacy
Stars: 1500, Commits: 1324, Contributors: 23
A Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library.
20. Finetune
Stars: 626, Commits: 1405, Contributors: 13
Finetune is a library that allows users to leverage state-of-the-art pretrained NLP models for a wide variety of downstream tasks.
21. TextHero
Stars: 1900, Commits: 266, Contributors: 17
Text preprocessing, representation and visualization from zero to hero.
22. Spark NLP
Stars: 1700, Commits: 4363, Contributors: 50
Spark NLP is a Natural Language Processing library built on top of Apache Spark ML.
23. GluonNLP
Stars: 2200, Commits: 712, Contributors: 72
GluonNLP is a toolkit that enables easy text preprocessing, datasets loading and neural models building to help you speed up your Natural Language Processing (NLP) research.
Computer Vision
24. Pillow
Stars: 7800, Commits: 10799, Contributors: 303
Pillow is the friendly PIL fork. PIL is the Python Imaging Library.
25. OpenCV
Stars: 49600, Commits: 29453, Contributors: 1234
Open Source Computer Vision Library
26. scikit-image
Stars: 4000, Commits: 12352, Contributors: 403
Image processing in Python
27. Mahotas
Stars: 644, Commits: 1273, Contributors: 25
Mahotas is a library of fast computer vision algorithms (all implemented in C++ for speed) operating over numpy arrays.
28. Simple-CV
Stars: 2400, Commits: 2625, Contributors: 69
SimpleCV is a framework for Open Source Machine Vision, using OpenCV and the Python programming language.
29. GluonCV
Stars: 4300, Commits: 774, Contributors: 101
GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in computer vision.
30. Torchvision
Stars: 7500, Commits: 1286, Contributors: 334
The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision.