Neural networks for all
It took a 6th grade science fair for me to train my first neural network in a long while. My family lives in a rural patch of Northern California filled with oaks, redwoods, and other native flora. For the upcoming science fair, my son decided to build a machine learning model to identify the local trees. We hiked on Saturday afternoon to take pictures and build a data set, and on Saturday evening we worked on a neural network model. The science fair is in two months, and I figured we would need that time to build a decent model. As it turns out, I badly underestimated how easy and accessible machine learning has become. Version 1 was complete after only a couple of hours, with surprising accuracy. We used the same powerful AI engine as many of the leading AI research labs. We did the whole thing in a browser, and it was free.
I first encountered neural networks in a college course on AI during the so-called AI winter of the late ‘90s. This was a low point of enthusiasm for AI in general, and neural nets in particular. Despite theoretical advances in the ‘80s, commercial applicability was limited and relatively few research labs were focused on neural nets. Still, I was intrigued by biological approaches to AI, and I experimented with different training approaches and applications. I built neural net models and coded the training algorithms mostly from scratch in the Lisp programming language. Training could take hours, and there wasn’t much training data available. For a time, I joined an AI lab at UC San Diego to explore neural networks further, but eventually I moved on. It would be another 15 years before neural nets really took off and the term “deep learning” came into wide usage.
Over the last decade or so, neural networks have undergone a renaissance powered by GPUs, cloud computing, massive data sets and continued advances in learning algorithms. Commercial applications have exploded. Applications include speech recognition (Siri, Alexa), translation (Google Translate), text generation (OpenAI’s GPT-3), image recognition (Tesla Autopilot, Adobe Sensei), image generation (DALL-E 2, Midjourney, Stable Diffusion), facial recognition (Apple Photos), video generation (Runway) and drug discovery. Some of the most impressive new applications are in generative AI, which is progressing at mind-boggling speed.
Fast forward to this past Thanksgiving weekend, when we built our model. The experience spurred a few observations.
First, it is astonishingly easy to set up and train even fairly sophisticated machine learning models. You don’t need to code or to have a background in machine learning. There are many no-code and AutoML tools that automate common steps in the machine learning workflow. To stitch together your own workflows, it is helpful to have basic knowledge of Python, a popular language in the machine learning community. My 12-year-old son has been learning Python for about two years. On his own initiative, he downloaded TensorFlow — the machine learning library that powers Google and OpenAI — and quickly had a sample model running on his laptop. Even easier than that, I found a tutorial on image classification that uses TensorFlow and runs on Google’s Colab environment. Colab lets you build machine learning models in a web browser and execute them on Google’s GPUs for free, without downloading or installing any packages at all. There are myriad tutorials and pre-trained models to start from.
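To give a sense of how little code is involved, here is a minimal sketch in the spirit of the beginner tutorials that ship with TensorFlow (not the exact notebook we followed). It trains a small classifier on the MNIST digits dataset that comes bundled with Keras, and it runs as-is in Colab or on a laptop.

```python
import tensorflow as tf

# MNIST handwritten digits, bundled with Keras; no separate download step needed.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

# A small fully connected classifier, roughly what the beginner tutorials build.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# A few epochs is enough for high accuracy on the held-out test set.
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```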
Second, the rate-limiting factors for deploying this technology are imagination for new real-world applications and quality data to train models. Recognizing a type of tree from a photo falls under the domain of image classification. Because this is such a common application of machine learning, the most challenging step is assembling a robust data set. We couldn’t find a readily available dataset of local trees labeled with species names, so we created our own. A short hike with an iPhone yielded about 250 images. That is tiny by today’s standards: the latest Stable Diffusion model is trained on the recently released LAION-5B dataset of 5.85 billion images from the web. As more data sets become public, the barriers to developing special-purpose models will drop.
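A few hundred photos can go a long way because you rarely train from scratch. Below is a hedged sketch of the standard transfer-learning recipe in TensorFlow, not our exact code: it assumes the photos are sorted into one folder per species (the "trees" directory and the species subfolders are hypothetical names) and trains only a small classification head on top of MobileNetV2 pre-trained on ImageNet.

```python
import tensorflow as tf

# Assumed (hypothetical) folder layout, one subfolder per species:
#   trees/coast_live_oak/*.jpg
#   trees/coast_redwood/*.jpg
IMG_SIZE = (224, 224)

train_ds = tf.keras.utils.image_dataset_from_directory(
    "trees", validation_split=0.2, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "trees", validation_split=0.2, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=32)
num_classes = len(train_ds.class_names)

# Start from MobileNetV2 pre-trained on ImageNet and freeze it; only the small
# classification head is trained, which is why ~250 images can be enough.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(num_classes),
])

model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)
```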
Third, there are now abundant solutions and resources for putting machine learning models into production (MLOps). Just a few years ago it was a challenge to train a model and deploy it at web speed and scale. Today, it’s relatively simple to deploy models in a browser, on a phone, on a server, or within an enterprise workflow.
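As one concrete example of how simple deployment has become, here is a short sketch of exporting a trained Keras model to TensorFlow Lite for on-device use; a similar conversion path exists for the browser via TensorFlow.js. The stand-in model and the output file name are placeholders, not the model we actually built.

```python
import tensorflow as tf

# Stand-in for a trained classifier (e.g., the transfer-learning model above),
# included only so this snippet runs on its own.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])

# One-step conversion to TensorFlow Lite for phones and embedded devices.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional: shrink weights via quantization
tflite_model = converter.convert()

# The resulting .tflite file can be bundled into an iOS or Android app.
with open("tree_classifier.tflite", "wb") as f:
    f.write(tflite_model)
```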
We have entered a golden age for neural networks. This is an incredibly exciting time to be an AI/deep learning researcher, an AI entrepreneur, or a student interested in AI. There are mature applications ready now, and a pipeline of emerging AI research that will power innovative applications for years to come. The accessibility of the technology means that you don’t need a research lab or capital to take part. It’s amazing how a concept can exist for decades (e.g., electric cars, neural nets) but not take off until conditions are just right.