This was inspired by a bright high school student who emailed me for advice about his interest in deep learning.
Q: Hello Dr. Thomas! I’ve been trying to find good resources for deep learning, but the field does seem rather cryptic and a bit technically prohibitive for me at this point. If you wouldn’t mind, I had a couple of questions I’d love to ask you about learning deep learning:
- Is there a single book or a set that you’ve found that explains deep learning well? I’ve looked at ones like deeplearning.net or MIT’s free book, but all the resources I’ve found are either too brief an introduction or wonderfully mathematically engaged but not applicable at all.
- Do you think it’s a good idea for me to frontload mathematical rigor at this point, or should I wait until I’m further down the path to try to get the technical details down?
- When you take on a data science problem, how do you answer the classic “what to try next” question? For instance, sometimes on Kaggle problems, I’ll hit a wall where I don’t know what the best next move is.
A: Your assessment that most deep learning resources are either too brief or too mathematical is spot-on! My partner Jeremy Howard and I feel the same way, and we are working to create more practical resources. We will soon be producing a MOOC based on the in-person course we taught this autumn in collaboration with the Data Institute at USF. Until then, here are my recommendations:
In my opinion, the best existing resource is the Stanford CNN course. I recommend working through all the assignments.
Below are some of my favorite tutorials, blog posts, and videos for those getting started with Deep Learning:
Convolutions
- Image kernels explained visually - fantastic interactive visualizations of convolutions
- Understanding Convolutions - a deep dive into convolutions, looking not just at image-processing applications but at the more general idea
- Jeremy’s Intro to convolutions video motivates and builds convolutions from scratch, and introduces max-pooling along the way (a minimal code sketch of both operations follows this list)
- This blog post is a nice introductory overview of CNNs, including convolutions and max-pooling.
- A fascinating deep dive into a Kaggle winner’s solution to an image classification problem
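None of the code below comes from those resources; it’s a minimal NumPy sketch of the two operations they all build on: sliding a small kernel over an image, and max-pooling the result. The toy image and the edge-detection kernel are my own choices for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a kernel over a 2D image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output pixel is the patch under the kernel,
            # multiplied elementwise by the kernel and summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Downsample by keeping the max of each non-overlapping size-by-size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)                  # a made-up 8x8 "image"
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])        # a classic edge-detection kernel
features = conv2d(image, edge_kernel)         # shape (6, 6)
pooled = max_pool(features)                   # shape (3, 3)
```

A real CNN learns its kernels from data rather than hard-coding them, but the sliding-window mechanics are exactly this.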
Gradient Descent
- In his intro to DL and Stochastic Gradient Descent video, Jeremy uses Excel to clarify (and simplify) how SGD works.
- Here are detailed tutorials explaining and implementing gradient descent and stochastic gradient descent in Python (a bare-bones sketch of the update rule follows this list).
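Before diving into those, it may help to see the whole idea in a few lines. This is a minimal sketch, not anyone’s production code: SGD fitting a line to noisy points, with the data, learning rate, and number of epochs all chosen arbitrarily.

```python
import numpy as np

# Toy data: points scattered around the line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 2 * x + 1 + rng.normal(0, 0.1, 100)

a, b = 0.0, 0.0   # the parameters we want to learn
lr = 0.1          # learning rate (step size)

for epoch in range(50):
    for i in rng.permutation(len(x)):    # "stochastic": one random point at a time
        err = (a * x[i] + b) - y[i]      # prediction error on this point
        a -= lr * err * x[i]             # gradient of the squared error w.r.t. a
        b -= lr * err                    # gradient of the squared error w.r.t. b

print(a, b)  # should end up near 2 and 1
```

Plain gradient descent would average the gradient over all 100 points before each step; SGD’s trick is that one noisy point at a time is enough, and far cheaper.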
RNNs
- Jeremy’s Intro to RNNs video is a code-filled introduction (a sketch of the basic recurrence follows this list).
- Andrej Karpathy’s post on the unreasonable effectiveness of RNNs includes lots of fun examples.
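To show what those introductions are building toward, here is a minimal sketch of the forward step at the heart of an RNN. The sizes, weights, and token sequence are invented, and there is no training loop, just the recurrence that lets the network carry state from one input to the next.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden = 5, 8                         # made-up sizes for a toy example
Wxh = rng.normal(0, 0.1, (hidden, vocab))    # input  -> hidden weights
Whh = rng.normal(0, 0.1, (hidden, hidden))   # hidden -> hidden weights (the recurrent part)
Why = rng.normal(0, 0.1, (vocab, hidden))    # hidden -> output weights

h = np.zeros(hidden)                         # the hidden state: the network's "memory"
for token in [0, 3, 1, 4]:                   # a toy sequence of token ids
    x = np.eye(vocab)[token]                 # one-hot encode the current token
    h = np.tanh(Wxh @ x + Whh @ h)           # new state mixes input with previous state
    logits = Why @ h                         # scores for predicting the next token
```

Everything in Karpathy’s character-level text generation examples is essentially this loop, plus training.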
Embeddings
- Chris Olah’s illuminating post on visualizations of language representations (a small sketch below shows the kind of vectors being visualized)
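Since that post can feel abstract on a first read, here is a minimal sketch of what an embedding is once trained: a table with one vector per word, where related words end up close together. The words and numbers below are hand-invented for illustration; real embeddings are learned from data.

```python
import numpy as np

# A toy embedding table: one row (vector) per word. Invented values.
words = ["cat", "dog", "car"]
emb = np.array([[0.9, 0.1, 0.0],
                [0.8, 0.2, 0.1],
                [0.0, 0.1, 0.9]])

def cosine(u, v):
    """Cosine similarity: near 1.0 means similar direction, near 0.0 means unrelated."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine(emb[0], emb[1]))   # "cat" vs "dog": high (similar words)
print(cosine(emb[0], emb[2]))   # "cat" vs "car": low  (unrelated words)
```

The visualizations in Olah’s post are essentially 2D projections of tables like this one, with tens of thousands of rows and hundreds of columns.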
As for your question about whether to front-load mathematical rigor, I think it’s best to focus on practical coding, since that way you can experiment and develop an intuition for what you’re doing. Math is best learned on an as-needed basis: if you can’t understand something you’re trying to learn because unfamiliar math concepts keep popping up, jump over to Khan Academy or to the absolutely beautiful 3Blue1Brown Essence of Linear Algebra videos (great for visual thinkers) and get to work! Jeremy’s RNN tutorial above is a nice example of a code-oriented approach to deep learning, although I know such resources can be hard to find.
It’s great that you’re doing Kaggle competitions. That is a fantastic way to learn, and to see whether you understand the theory you’re reading about. As for the classic “what to try next” question, I’d have to know more about what you’re working on before I could suggest a next step.