Machine Learning Notes: fastai
(Please refer to Wow It Fits! — Secondhand Machine Learning.)
Compared with tensorflow, mxnet, paddle, or pure numpy (just for the fun of it), torch is probably the easiest Machine Learning package, and to make it even easier, let's take a look at fastai.
By the way, I subscribed to GitHub Trending via RSS and the other day I got these two at the same time. Machine Learning in numpy is really cool, but the second one is like, why? … these two target markets do NOT overlap.
Back to the topic, here are my notes from fastai, fastbook (github.com) and Practical Deep Learning for Coders. The main idea of fastai is to make Machine Learning accessible to every individual so that it can be applied to various subjects. To do so:
- Use Google Colab, Kaggle, Paperspace, or Hugging Face.
- Require only one GPU (or at least try to).
- fastai starts with high-level abstractions (see fastai - Quick start) and then digs deeper, so it stays customizable. Some of these high-level abstractions are so useful that you will see them in almost any project that uses fastai, for example DataLoaders and Learner.
Tabular Data
Random Forest is baseline, sometimes even the best method. See sklearn.ensemble.RandomForestClassifier — scikit-learn 1.3.0 documentation.
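A minimal sketch of that baseline, using sklearn's built-in iris dataset as a stand-in for real tabular data (the dataset and hyperparameters are my choice, not from the notes):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small tabular dataset (iris is just a placeholder here).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# A random forest with near-default settings makes a solid baseline.
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
acc = rf.score(X_test, y_test)
print(f"baseline accuracy: {acc:.3f}")
```

If a fancier model can't beat this, it's usually not worth the complexity.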
Transfer Learning for CV
See The best vision models for fine-tuning | Kaggle.
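The core mechanic of fine-tuning — freeze the pretrained body, train a fresh head — can be sketched in plain PyTorch. The tiny "backbone" below is a made-up stand-in for a real pretrained model (in practice you'd pull one from timm or torchvision):

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained backbone (normally loaded with pretrained weights).
backbone = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
head = nn.Linear(16, 10)  # new head for our hypothetical 10-class task

# Freeze the backbone: only the head's parameters stay trainable.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
trainable = [p for p in model.parameters() if p.requires_grad]
print(len(trainable))  # → 2 (the head's weight and bias)
```

fastai's fine_tune does essentially this for you: it trains the head for a bit with the body frozen, then unfreezes and trains the whole model at discriminative learning rates.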
Two Categories
Predicting two categories at the same time can have two advantages:
- Parallel computing, which saves time and computing resources: one forward pass serves both predictions.
- The categories might help each other. For example, predicting the fish and the boat together may produce better results than predicting only the fish.
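A minimal two-head sketch in PyTorch: one shared body, two classification heads, both predicted in a single forward pass (the class counts and layer sizes are made up):

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    """Shared body with two heads, e.g. one for fish species, one for boat type."""
    def __init__(self, n_fish=5, n_boat=3):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
        self.fish_head = nn.Linear(16, n_fish)
        self.boat_head = nn.Linear(16, n_boat)

    def forward(self, x):
        z = self.body(x)  # shared features, computed once
        return self.fish_head(z), self.boat_head(z)

net = TwoHeadNet()
fish_logits, boat_logits = net(torch.randn(4, 32))
```

At training time the two cross-entropy losses can simply be summed, so the shared body learns features useful for both tasks.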
Learner
```python
from fastai.vision.all import Learner

learn = Learner(...)
```
See fastai - Pytorch to fastai details. Here are some useful tricks:
To find the best learning rate, see [1506.01186] Cyclical Learning Rates for Training Neural Networks:
```python
learn.lr_find(suggest_funcs=(slide, valley))
```
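Under the hood, lr_find is roughly the LR range test from the Smith paper: train for a few batches while growing the learning rate exponentially and record the loss, then pick a rate a bit below where the loss blows up. A toy pure-PyTorch version on synthetic data (my own sketch, not fastai's implementation):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randn(256, 1)  # synthetic regression data
model = nn.Linear(10, 1)
lr, mult = 1e-6, 1.3  # start tiny, grow the LR by 30% each batch
opt = torch.optim.SGD(model.parameters(), lr=lr)

lrs, losses = [], []
for i in range(0, 256, 32):
    xb, yb = X[i:i + 32], y[i:i + 32]
    loss = nn.functional.mse_loss(model(xb), yb)
    opt.zero_grad()
    loss.backward()
    opt.step()
    lrs.append(lr)
    losses.append(loss.item())
    lr *= mult
    for g in opt.param_groups:  # raise the LR for the next batch
        g["lr"] = lr
# Plot losses against lrs (log scale) and pick an LR just before the loss diverges.
```

fastai's slide and valley suggesters automate that "just before it diverges" judgment call.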
To test the model with a dummy:
```python
test_df = ...
test_dl = learn.dls.test_dl(test_df)
preds, _ = learn.get_preds(dl=test_dl)
```