September 15, 2021 Author: Matthew Renze

How do we build robots and self-driving cars with AI?

In my previous article in this multi-part series on The AI Developer’s Toolkit, we learned how to compose AI models to build software applications.

However, there are many tasks that we would like to perform within the physical world using both hardware and software.

To do this, we need to construct a cyber-physical system – a machine where the hardware and software are deeply intertwined.

Cyber-Physical Systems

A cyber-physical system is a machine that can sense the world around it and choose actions to achieve a goal of some kind. For example:

  • a robotic vacuum that cleans your floors
  • a collaborative robot that works alongside humans
  • a self-driving car that drives you to your destination

These cyber-physical systems combine a variety of sensors to sense the world around them and a variety of actuators to act upon it. Essentially, sensors and actuators are the inputs and outputs of our cyber-physical systems.
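To make this concrete, here is a minimal sketch in Python of sensors as inputs and actuators as outputs. The Sensor and Actuator names are hypothetical and used for illustration only; they don't come from any particular robotics framework.

# Hypothetical interfaces: sensors are the system's inputs, actuators its outputs.
from typing import Protocol, Any

class Sensor(Protocol):
    """An input: converts part of the physical world into digital data."""
    def read(self) -> Any:
        ...

class Actuator(Protocol):
    """An output: turns a digital command into a physical effect on the world."""
    def apply(self, command: Any) -> None:
        ...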

The Process

But how does this all work? To understand this, we need to learn about the process that these cyber-physical systems use to choose actions given the state of their environment.

1. State
First, we have the current state of the environment. This is the state of the world in which the AI system currently operates.

2. Sensors
Next, we sense the world using a variety of sensors. Sensors convert the state of the world into digital data.

3. Data
The data from the sensors can be audio, images, and video, as we’ve previously seen. However, it can also come from other types of sensors, such as velocity, radar, and GPS sensors.

4. Perception
Then, we apply perceptual machine-learning models to the sensor data. These models, as we’ve already seen, convert the sensor data into features.

5. Features
Features are abstract representations of the physical world. We merge data from multiple sensors in a process called sensor fusion and represent the resulting features in what we call feature space. Think of feature space as the system’s mental model of the state of the world.
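Here is a rough illustration of sensor fusion in Python. The sensor readings and the layout of the feature vector are made up for this example; a real system would fuse far richer data.

import numpy as np

def fuse(camera_features: np.ndarray,
         radar_range_m: float,
         gps_position: tuple,
         velocity_mps: float) -> np.ndarray:
    """Merge readings from several sensors into a single feature vector."""
    # Concatenating everything gives one point in feature space:
    # the system's mental model of the current state of the world.
    return np.concatenate([
        camera_features,        # e.g. the output of a perception model
        [radar_range_m],        # distance to the nearest obstacle, in meters
        list(gps_position),     # latitude and longitude
        [velocity_mps],         # current speed, in meters per second
    ])

features = fuse(np.array([0.12, 0.87, 0.05]), 14.2, (41.6, -93.6), 8.3)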

6. Planning
Next, we use a planning engine to develop a plan of action. This planning engine is often a combination of explicit programming and a type of machine learning called reinforcement learning.
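As a rough sketch of the reinforcement-learning side, here is the classic tabular Q-learning update in Python. The state and action indices are made up for illustration; a real planning engine would be far more sophisticated.

import numpy as np

# A toy Q-table: rows are discretized states, columns are candidate actions.
# In a real system, the state would come from the feature space described above.
num_states, num_actions = 10, 4
Q = np.zeros((num_states, num_actions))
alpha, gamma = 0.1, 0.99  # learning rate and discount factor

def q_learning_update(state: int, action: int, reward: float, next_state: int) -> None:
    """Nudge the value of (state, action) toward the reward plus discounted future value."""
    best_next_value = np.max(Q[next_state])
    Q[state, action] += alpha * (reward + gamma * best_next_value - Q[state, action])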

7. Actions
Actions represent everything the system can possibly do. For example, accelerate, brake, turn left, or turn right. The chosen action is the one that maximizes the expected likelihood of achieving the system’s goal, subject to a set of constraints.
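For example, here is a simplified sketch of that selection step in Python: filter out the actions that violate the constraints, then pick the one with the highest expected value. The scores and constraints are invented for this illustration.

def choose_action(expected_values: dict, allowed_actions: set) -> str:
    """Pick the allowed action with the highest expected value."""
    # Apply the constraints first: discard any action the system may not take.
    candidates = {a: v for a, v in expected_values.items() if a in allowed_actions}
    # Then maximize: choose the action most likely to achieve the goal.
    return max(candidates, key=candidates.get)

# Example: turning left scores highest but is ruled out by a constraint,
# so the system accelerates instead.
action = choose_action(
    {"accelerate": 0.62, "brake": 0.10, "turn_left": 0.80, "turn_right": 0.25},
    allowed_actions={"accelerate", "brake", "turn_right"},
)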

8. Actuators
Finally, the chosen action is fed to the actuators. Actuators include motors and other outputs that can physically affect the state of the world, carrying out the chosen action.
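Here is a hedged sketch of that last step in Python, assuming the throttle, brakes, and steering objects implement the hypothetical Actuator interface from the earlier sketch. The command values are arbitrary.

def actuate(action: str, throttle, brakes, steering) -> None:
    """Send the chosen action to the appropriate actuator."""
    # Each actuator physically changes the state of the world.
    if action == "accelerate":
        throttle.apply(0.3)    # open the throttle to 30%
    elif action == "brake":
        brakes.apply(1.0)      # apply full braking force
    elif action == "turn_left":
        steering.apply(-0.2)   # steer slightly to the left
    elif action == "turn_right":
        steering.apply(0.2)    # steer slightly to the right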

9. Repeat
However, the changed state of the world also becomes the input for the next iteration of our function. So, we have to repeat this loop several times a second, choosing new actions with each iteration. This feedback loop makes things much more complicated than the simple input-to-output mappings we saw in the previous articles in this series.
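Putting it all together, here is a minimal sketch of that feedback loop in Python. The perceive, plan, and actuate callables stand in for the earlier sketches, and the 20 Hz rate is just an illustrative choice.

import time

def run_control_loop(sensors, perceive, plan, actuate, hz: float = 20.0) -> None:
    """Sense, plan, and act in a feedback loop that repeats several times per second."""
    period = 1.0 / hz
    while True:
        data = [sensor.read() for sensor in sensors]  # steps 2-3: sense the world
        features = perceive(data)                     # steps 4-5: perception and sensor fusion
        action = plan(features)                       # steps 6-7: plan and choose an action
        actuate(action)                               # step 8: act upon the world
        time.sleep(period)                            # step 9: repeat; the changed world
                                                      # becomes the next iteration's input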


If you’d like to learn how to build cyber-physical systems with AI, please watch my online course: The AI Developer’s Toolkit.

The future belongs to those who invest in AI today. Don’t get left behind!

Start Now!
