Solving Three Common Manufacturing Problems With Reinforcement Learning

Discover how reinforcement learning can be used to solve three common manufacturing problems.

Like many industries, manufacturing is experiencing an explosion in both the growth of and access to data. The data is complex and multi-faceted: it may originate from the production line, the environment, product usage, or even from users. The explosion itself is often called “big data”, and its effect on the industry is called smart manufacturing (USA) or Industrie 4.0 (Germany).

The data must be acted upon to be useful. Doing this manually is often time-consuming and inefficient. Machine learning (ML) and reinforcement learning (RL) algorithms can automate the decisions that are made from the data. These methodologies can be applied to deliver advanced manufacturing technologies, sustainable processes, and innovative products. These improvements are also implicitly linked to advancements in supply chain and inventory control technology, which I discuss in another article.

RL in particular, a technology that evaluates sequential processes to deliver optimal strategies, is exciting because it opens up new applications that are unsuitable for ML alone.

In this article I present recent, exciting research that demonstrates the applicability of RL to three manufacturing problems, which my team and I are able to deliver as point solutions: scheduling, development and production, and assembly.

My New Book on Reinforcement Learning

Do you want to use RL in real-life, business applications? Do you want to learn the nitty-gritty? The best practices?

We've written a book for O'Reilly on Reinforcement Learning. It focuses on industrial RL, with lots of real-life examples and in-depth analysis.

Find out more on

Manufacturing Job Scheduling and Dispatching

Production scheduling problems can be split into two parts: the configuration and the objective. The configuration includes the number of machines or lines involved in production and whether each runs a single job or multiple jobs at a time. The scheduling objective depends on the requirements of the business, but many choose to optimize for meeting due dates.

These problems are traditionally hard to solve, because the constraints are hard to satisfy and the dynamism of production makes a static heuristic inefficient. RL can overcome both of these issues, and I present a selection of exciting examples below:

Product Development and Production

Machines are often used for manufacturing tasks that humans are not well suited for, such as those that require high precision or repeatability, or that take place in hazardous environments. Many of these are complex control tasks that are hard to program or have variables that are difficult to account for. For example, creating glass fibre for optic cables requires finesse and careful control that depends on the individual qualities of the glass, the furnace, the control mechanism, and the fibre requirements. RL is well suited to learning optimal control strategies for fibre creation.

Similar challenges exist in industries as diverse as textile manufacture, fermentation, and nanofabrication. In many heavily automated industries it is even possible to provide end-to-end product development, as in drug discovery. Below is a list of recent research to provide inspiration for using RL/ML as an integral part of your production:

Assembly and General Robotics Tasks

The final group of use cases I want to highlight is much more general than the previous two. Using robots for manufacturing has been popular for years. But more recently there has been a trend towards using generalized, multi-purpose robots that can be re-programmed for a variety of tasks. Unfortunately, re-programming is very labour-intensive and requires a high level of skill, and even then, the resulting policies are often sub-optimal due to limitations in sensing and control. Furthermore, there are many tasks that, at first sight, appear to be simple, but are in fact very difficult due to variations in starting states.

One great example of a difficult challenge is the insertion task. Here, a relatively simple robot has the task of inserting an object, like a bolt or an electronic component, into a slot. But the target may be slightly misplaced or manufactured incorrectly, and the robot needs to adapt to place the object in the right hole. It is incredibly difficult to develop static strategies for such a scenario. However, RL is well suited because the learned strategies generalize across nearly all eventualities; placement is “intelligent”.
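The idea of adapting to a variable starting state can be sketched with a deliberately tiny example. The code below is a toy under stated assumptions, not a description of how a real robot is trained: the peg's misalignment is a single integer offset that the agent observes exactly, the three actions and the reward values are invented for illustration, and the learning algorithm is plain tabular Q-learning. Real insertion tasks typically involve continuous control and noisy sensing.

```python
import random

OFFSETS = range(-3, 4)  # peg misalignment relative to the hole, in grid cells
ACTIONS = ("left", "right", "insert")


def step(offset, action):
    """Hypothetical dynamics: return (next_offset, reward, done)."""
    if action == "insert":
        if offset == 0:
            return offset, 10.0, True   # aligned: insertion succeeds
        return offset, -5.0, False      # misaligned: the peg jams
    move = -1 if action == "left" else 1
    next_offset = max(-3, min(3, offset + move))
    return next_offset, -1.0, False     # every move costs a little time


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over randomly misaligned starting states."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in OFFSETS for a in ACTIONS}
    for _ in range(episodes):
        offset = rng.choice(list(OFFSETS))  # each episode starts misplaced
        for _ in range(50):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(offset, a)])  # exploit
            nxt, reward, done = step(offset, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(offset, action)] += alpha * (target - q[(offset, action)])
            offset = nxt
            if done:
                break
    return q


def greedy_insert(q, offset, max_steps=20):
    """Follow the learned policy; return the steps taken, or None on failure."""
    for steps in range(1, max_steps + 1):
        action = max(ACTIONS, key=lambda a: q[(offset, a)])
        offset, _, done = step(offset, action)
        if done:
            return steps
    return None


q_table = train()
for start in OFFSETS:
    print(start, greedy_insert(q_table, start))
```

The point of the sketch is that nobody writes a rule per starting offset. The same learned table handles every misalignment it was trained on, which is the property that makes RL attractive when starting states vary.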

Below is a list of similar challenges that can be solved with RL:


In general, RL is best suited to tasks that are dynamic and where it is hard to define static control policies. Manufacturing has a wide range of these, and I have presented three general areas here. But there are undoubtedly more that are unique to your business and situation.

I see a trend towards increasing levels of automation, which brings a level of sophistication that current tooling is not well suited to. RL provides an enticing solution to that problem as well, because it copes well with extremely complex domains.

My colleagues and I at Winder Research are here to help you find the best data-driven solution to your problem, of which RL may (or may not) be an integral part. Please contact us if you’d like to talk more about any of the topics in this article, or indeed any data-oriented problem that you have. And you can find out more about RL by reading our book.