How do we build AI applications?
All of the tools we’ve covered so far in this series on The AI Developer’s Toolkit solve relatively simple and isolated problems of perception and cognition. These stand-alone models are often powerful enough to add a new AI feature to an existing product or service.
However, many real-world problems require more complex solutions than a simple mapping from one type of data to another. To accomplish these more complex tasks, we’ll need to create modular AI applications.
When we’re building AI applications, we can combine the outputs of multiple models to make more powerful predictions. This lets us make predictions that would not be possible using only a single type of input data, or improve the accuracy of a prediction by fusing data from multiple overlapping sensory inputs.
For example, we could combine both images and audio from a web camera for user recognition. Using both a user’s visual appearance and their voice allows us to more accurately identify the user.
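As a rough illustration of combining two models, here is a minimal sketch of one possible fusion strategy: a weighted average of per-user scores. The function names, the `face_weight` parameter, and the example scores are all hypothetical, and we assume each model already returns a probability for every known user. A real system would likely learn the weights from data rather than fix them by hand.

```python
from typing import Dict


def fuse_scores(
    face_scores: Dict[str, float],
    voice_scores: Dict[str, float],
    face_weight: float = 0.5,  # hypothetical weight; tune or learn in practice
) -> Dict[str, float]:
    """Combine two per-user score dictionaries with a weighted average."""
    users = set(face_scores) | set(voice_scores)
    fused = {
        user: face_weight * face_scores.get(user, 0.0)
        + (1.0 - face_weight) * voice_scores.get(user, 0.0)
        for user in users
    }
    # Renormalize so the fused scores sum to 1 (when there is any score at all).
    total = sum(fused.values())
    if total > 0:
        fused = {user: score / total for user, score in fused.items()}
    return fused


def identify_user(face_scores: Dict[str, float], voice_scores: Dict[str, float]) -> str:
    """Return the user with the highest fused score."""
    fused = fuse_scores(face_scores, voice_scores)
    return max(fused, key=fused.get)


# Made-up outputs from a face model and a voice model for the same moment.
face = {"alice": 0.55, "bob": 0.45}   # the face model is nearly undecided
voice = {"alice": 0.20, "bob": 0.80}  # the voice model strongly favors bob
best_match = identify_user(face, voice)
```

In this made-up case the face model alone slightly prefers one user, but the voice model's stronger evidence tips the fused result the other way.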
We can also chain together multiple models so the output from one model becomes the input to another model. This allows us to apply AI to multiple layers of data operating at various levels of abstraction.
For example, we can chain object detection and image classification together to perform “object classification”. The object detector locates the objects in the image. Next, we crop the images within each bounding box. Then, the image classifier classifies the type of object in each cropped image.
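The detect, crop, classify chain described above can be sketched in a few lines. This is only an outline under simplifying assumptions: the image is a plain list of pixel rows, and `toy_detector` and `toy_classifier` are hypothetical stand-ins for real trained models.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region inside a bounding box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def classify_objects(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Chain the two models: detect boxes, crop each one, classify each crop."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


def toy_detector(image: Image) -> List[Box]:
    """Stand-in for an object detector: always reports two fixed boxes."""
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_classifier(patch: Image) -> str:
    """Stand-in for an image classifier: labels a patch by mean brightness."""
    pixels = [pixel for row in patch for pixel in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


# A 4x4 test image with a bright top-left corner and dark pixels elsewhere.
image = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
labeled_objects = classify_objects(image, toy_detector, toy_classifier)
```

Because `classify_objects` only depends on the two callables, either stand-in can be swapped for a real model without touching the chaining logic.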
Many real-world AI applications combine and chain various models together to form an AI application pipeline. You can think of it like building AI systems with Lego building blocks. You match the outputs you have with the inputs you need. Then you wire these AI building blocks together with adapter code.
With modular AI, we’re wiring a number of small, highly specialized machine-learning models together to solve a specific problem. With end-to-end AI, in contrast, we’re training a single large machine-learning model to perform all of the same tasks at once.
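To make the idea of adapter code concrete, here is a minimal sketch of a pipeline in which one model's output does not quite match the next model's input. All of the names here (`build_pipeline`, `toy_speech_to_text`, `transcript_adapter`, `toy_sentiment`) are hypothetical, and the two "models" are trivial stand-ins rather than real services.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def build_pipeline(*steps: Step) -> Step:
    """Return a function that feeds its input through each step in order."""
    def run(data: Any) -> Any:
        return reduce(lambda value, step: step(value), steps, data)
    return run


def toy_speech_to_text(audio: bytes) -> dict:
    """Stand-in for a speech model: returns a transcript plus metadata."""
    return {"transcript": audio.decode("utf-8"), "confidence": 0.9}


def transcript_adapter(result: dict) -> str:
    """Adapter: pull out the text and normalize it for the next model."""
    return result["transcript"].strip().lower()


def toy_sentiment(text: str) -> str:
    """Stand-in for a sentiment model that expects clean lowercase text."""
    return "positive" if "great" in text else "neutral"


# The adapter sits between the two blocks so their formats line up.
pipeline = build_pipeline(toy_speech_to_text, transcript_adapter, toy_sentiment)
sentiment = pipeline(b"  This is GREAT ")
```

The adapter is the only piece that knows about both formats, so replacing either model usually means rewriting just that small function.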
There are various pros and cons to each of these two approaches to creating AI applications. Modular AI systems are easier to create, maintain, and debug. However, they require domain expertise and lots of code to wire the modules together.
End-to-end applications tend to be more efficient, more powerful, and less biased. However, they are much harder to create, require much more data, and are far less transparent.
So my general advice is to always start with a modular AI application. Decompose the problem into parts, solve each smaller subproblem with a single model, then wire the parts together to create a full solution.
Once you’ve solved the problem using a modular approach, you can always upgrade to an end-to-end application later. However, you will now have the benefit of being able to use the modular application to help train, verify, and debug the end-to-end application.
If you’d like to learn how to build AI applications, please watch my online course: The AI Developer’s Toolkit.
The future belongs to those who invest in AI today. Don’t get left behind!