The most frequent question I get from developers is: what is the best way to get into Machine Learning?
A few years back, my response was:
- Google for the best resources and learn
- Find problems at work and apply what you learn
Though that response was honest and sincere, I soon realized that it was taken as a motherhood statement. So as a more concrete reply, I narrated my path in an article, An Engineer’s Trek into Machine Learning, and shared the link whenever someone asked.
I genuinely thought I had simplified it with that article, but it was still not simple enough. The reason is analysis paralysis driven by fear:
- The material available on the internet for learning ML is overwhelming.
- Somehow it gives the impression that you must first master hard math, and that scares people away.
So, let me first say that I do NOT know what the BEST way is, but I know a reasonably good way. And starting on a reasonably good path is much better than looking for the best path.
If you are a developer and prefer learning by doing, here is a path that is optimized for speed in acquiring working knowledge. It first does a breadth-first scan of ML techniques, applies them to problems, and then explores deeper based on the problem at hand.
Step 1: Start with Kaggle micro-courses
Each Kaggle micro-course takes 3–5 hours. Start with these 5:
Data is often stored in data warehouses, where it is accessed and manipulated using SQL. The following two courses will get you up and running:
Move to Step 2, but return and do these 3 as you gain more experience:
Return to these when you feel the need to learn neural networks:
Step 2: Participate in Kaggle Competitions
Nope, I am not talking about those $$$ competitions. Check out:
If you prefer a more structured approach, you can try:
By now you will have a good overview of techniques and some taste of applying them.
Step 3: Do an ML Crash Course or Bootcamp
The last two steps were designed to quickly get you up and running. Now is the time to dig a bit deeper. Try Google’s ML crash course, or any other course for practitioners (e.g. Udemy DS/ML Bootcamp).
My advice: don’t do it passively. Apply it to a problem that you have at work or play. Maybe even try out $$$ Kaggle competitions.
Step 4: Andrew Ng’s ML Course
Many people use Andrew Ng’s ML course as the starting point. It has math, but not too much. I highly recommend it. It will give you a peek into what is happening behind the scikit-learn APIs you have used so often. The course uses MATLAB, but some people have implemented the ML course assignments in Python too.
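To give a flavor of that peek behind the APIs, here is a minimal sketch of gradient descent for a one-variable linear regression, one of the first algorithms the course covers. It is pure Python; the function name `fit_linear`, the learning rate, and the toy data are all made up for illustration. It is not how scikit-learn itself fits a line (its `LinearRegression` uses a least-squares solver), just the kind of loop you end up writing by hand in such a course.

```python
# Illustrative sketch: fit y = w*x + b with batch gradient descent,
# using only the standard library. Names and data are hypothetical.

def fit_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Return (w, b) that approximately minimize mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

On this toy data the learned `w` and `b` should land very close to the 2 and 1 used to generate it. Try a much larger learning rate and watch the parameters diverge; that kind of experiment is where the math in the course starts to feel useful.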
Step 5: Neural Networks
These 5 steps will give you a pretty good practitioner’s know-how to work on various ML problems. And that is what I recommend: now apply what you have learned.
There might come a time when you feel that you don’t know enough. But by then you will know the field and yourself well enough to figure out your next step. Maybe dive into math with An Introduction to Statistical Learning, advanced courses at DeepLearning.ai, or NLP courses from HuggingFace.
Finally, on Jupyter Notebooks
While you do these courses and practice, you will be using Notebooks. But as software developers, you will also see the challenges in productionizing notebooks. This is a continuing debate:
My opinion: tools are designed for a purpose, and we should develop judgment for picking the best tool for the task at hand. I neither love nor hate notebooks; I just use them when convenient.
I hope you enjoyed this issue. Do let me know what path you took from developer to data scientist.