This post has been translated into Chinese here.
This week’s Ask-A-Data-Scientist column has a question from a college freshman at my alma mater, Swarthmore. Please email your data science-related quandaries to [email protected]. Note that questions are edited for clarity and brevity. Previous posts include:
- How should I focus my study time? How can I find a specialty?
- Does Machine Learning as a Service (MLaaS) work? Do you need a PhD?
- How to change careers and become a data scientist
- How to structure your data science and engineering teams
- Advice to a student interested in deep learning
Q: I’m currently a freshman at Swarthmore College and I’m really interested in machine learning and deep learning. I wanted to take Artificial Intelligence this semester; unfortunately, no freshmen got into the class as it has been difficult for the CS department to keep up with the huge spike in interest.
I’m currently taking Andrew Ng’s Coursera course on Machine Learning and will finish it in ~2-3 weeks. Next, I was planning on taking your fast.ai MOOC, which I saw on Hacker News.
I know you may be too busy, but can I ask you questions I have about ML and my proposed plan? How can I continue to learn machine learning after Ng’s Coursera course and fast.ai? It seems like the only two options are 1.) research and 2.) graduate level courses at UPenn (which seem to be quite difficult to get into from Swarthmore (especially as a first-year student)). Any advice would be appreciated.
A: In general, I am happy to answer questions, although it may take me some time (my inbox, oh my inbox). For technical questions, it’s best to first ask on our fast.ai forums. There are tons of interesting discussions on our forums, even if you are not taking our course. For career-related or general questions, I often answer them in my ask-a-data-scientist column.
Even without Swarthmore or UPenn’s AI classes, you will never run out of things to do with deep learning or ways to learn more. Our MOOC takes 70 hours of study to complete, and if you get interested in any of the Kaggle competitions we have you start, you could spend much longer. We will be releasing Part 2 in a few months, which will be a similar time commitment, only with even more side avenues for further study, recommended papers to read, and ways to extend the work.
Take the official classes when/if you are able, but you don’t need the credentials or resources from official classes (to anyone out there not in university or at a university that doesn’t offer an AI class, don’t worry: you don’t need them!). One of our students, who was an econ major with no graduate degree, was just accepted to the prestigious Google Brain residency program! Another student developed a new fraud detection technique based on material from our course and has received a bonus at his job. Several others have received internship and job offers, or switched teams in their current workplaces to more exciting machine learning projects.
Credentials can sometimes be useful to get your foot in the door, particularly if you are an underrepresented minority in tech (and thus facing greater scrutiny).
However, there are lots of even more effective ways to get your name and work out there:
- Write a popular blog post (more on this below).
- Create an interesting app and put it online.
- Write helpful answers to others’ questions on the learn machine learning subreddit or on the fast.ai forums. Altruism is important to me, but that’s not why I recommend helping others. Explaining something you’ve learned to someone else is a key part of solidifying your own understanding.
- Do your own experiments, and share the results via a blog post or GitHub. One of our students, Slav Ivanov, asked about using different optimizers for style transfer. Jeremy suggested he try it out, and Slav wrote an excellent blog post on what he found. This post was very popular on Reddit and made Slav’s work more widely known.
- Contribute to open source. Here, one of our students shares about his positive experience contributing to TensorFlow. With 3 lines of code, he reduced the binary size of TensorFlow on Android to less than 10MB!
In general, I recommend that you start a side project of something that interests you (that uses deep learning) so you will have that to work on.
Why you (yes, you) should blog
The top advice I would give my younger self would be to start blogging sooner. Here are some reasons to blog:
- It’s like a resume, only better. I know of a few people who have had blog posts lead to job offers!
- Helps you learn. Organizing knowledge always helps me synthesize my own ideas. One of the tests of whether you understand something is whether you can explain it to someone else. A blog post is a great way to do that.
- Leads to invitations. I’ve been invited to attend and speak at conferences because of my blog posts. I was invited to the TensorFlow Dev Summit (which was awesome!) for writing a blog post about how I don’t like TensorFlow.
- Meet new people. I’ve met several people who have responded to blog posts I wrote.
- Saves time. Any time you answer a question multiple times through email, you should turn it into a blog post, which makes it easier for you to share the next time someone asks.
To inspire you, here are some sample blog posts from students in part 2 of our course:
- Linear Algebra Cheat Sheet for Deep Learning
- CNNs from Different Viewpoints
- Setting up a Deep Learning Machine in a Lazy yet Quick Way
- Non-artistic Style Transfer (or How to Draw Kanye using Captain Picard’s Face)
I enjoyed all of the above blog posts, and I don’t think any of them are too intimidating. They’re meant to be accessible.
Tips for getting started blogging
Jeremy had been suggesting for years that I should start blogging, and I’d respond “I don’t have anything to say.” This wasn’t true. What I meant was that I didn’t feel confident, and I felt like the things I could write had already been written about by people with more expertise or better writing skills than me.
It turns out that is fine! Your posts don’t have to be earth-shattering or even novel to be read and shared. My writing skills were rather weak when I started (part of the reason I chose to study math and CS in college was that those courses required the least amount of writing and also no labs), but my skills are improving with time.
Here are some more tips to help you start your first post:
- Make a list of links to other blog posts, articles, or studies that you like, and write brief summaries or highlight what you particularly like about them. Part of my first blog post came from my making just such a list, because I couldn’t believe more people hadn’t read the posts and articles that I thought were awesome.
- Summarize what you learned at a conference you attended, or in a class you are taking.
- Any email you’ve written twice should be a blog post. Now, if I’m asked a question that I think someone else would also be interested in, I try to write it up.
- Don’t be a perfectionist. I spent 9 months on my first blog post, it went viral, and I have repeatedly hit new lows in readership ever since then. One of my personal goals for 2017 is to post my writing more quickly and to obsess less before I post, because obsessing just builds up pressure and I end up writing less.
- You are best positioned to help people one step behind you. The material is still fresh in your mind. Many experts have forgotten what it was like to be a beginner (or an intermediate) and have forgotten why the topic is hard to understand when you first hear it. The context of your particular background, your particular style, and your knowledge level will give a different twist to what you’re writing about.
- What would have helped you a year ago? What would have helped you a week ago?
- If you are a woman in NYC, Chicago, or San Francisco, I recommend joining your local chapter of Write/Speak/Code, a group that encourages women software developers to write blog posts, speak at conferences, and contribute to open source.
- Get angry. The catalyst that finally got me to start writing was when someone famous said something that made me angry. So angry that I had to explain all the ways his thinking was wrong.
- If you’re wondering about the actual logistics, Medium makes it super simple to get started. Another option is to use Jekyll and GitHub Pages. I can personally recommend both, as I have 2 blogs and use one for each.
You are on the right path by taking MOOCs, and by adding in a side project, involvement in online communities, and blogging, you will have even more opportunities to learn and meet others!