Reinforcement Learning Consulting and Development

What is RL?

Reinforcement learning automates strategic decisions that happen over time.

What Is Reinforcement Learning?

As we describe in our book, reinforcement learning (RL) is a sub-discipline of machine learning (ML) that specializes in teaching machines to execute multi-step, strategic decisions.

Traditional ML automates single decisions. But these decisions don’t have any context, nor do they operate over sequences. For example, a traditional recommendations algorithm recommends a single set of products and the algorithms are optimized to improve that single recommendation. But this is the wrong objective. You don’t want to optimize for single placements. You should be optimizing for increased engagement or higher profitability per customer, or whatever your business prioritizes.

RL allows you to train your models to do exactly that: optimize decisions over a period of time towards your organization’s unique goal.
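To make the difference between optimizing a single decision and optimizing a sequence concrete, here is a minimal sketch. It is a toy, not an RL algorithm: it simply enumerates every action sequence by brute force. The action names (`hard_sell`, `helpful`), the reward numbers, the three-visit horizon, and the rule that a hard sell ends the customer relationship are all assumptions invented for illustration.

```python
from itertools import product

# Hypothetical toy numbers, invented for illustration only.
IMMEDIATE_REWARD = {"hard_sell": 1.0, "helpful": 0.6}
HORIZON = 3  # number of customer visits we look ahead


def total_reward(actions):
    """Sum rewards over a sequence of visits; a hard sell ends the relationship."""
    total = 0.0
    for action in actions:
        total += IMMEDIATE_REWARD[action]
        if action == "hard_sell":
            break  # assumption: the customer does not come back
    return total


def single_step_policy():
    """Pick the action with the best immediate reward at every visit."""
    best = max(IMMEDIATE_REWARD, key=IMMEDIATE_REWARD.get)
    return (best,) * HORIZON


def sequence_policy():
    """Pick the sequence of actions with the best total reward."""
    return max(product(IMMEDIATE_REWARD, repeat=HORIZON), key=total_reward)


myopic = single_step_policy()
strategic = sequence_policy()
print("single-step:", myopic, total_reward(myopic))
print("sequence:   ", strategic, total_reward(strategic))
```

Under these assumed numbers, the single-step choice collects one reward and loses the customer, while the sequence-aware choice accepts a smaller reward early in order to collect more overall. A real RL system would have to learn this trade-off from experience rather than enumerate it.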

What is Reinforcement Learning Not?

RL is Not a Panacea

RL is very good at solving multi-step, sequential decision-making problems. For example, YouTube was able to improve video recommendation performance while reducing the number of data scientists required to implement such a solution. But industry is awash with “low-hanging fruit” that is best served by simple cloud-native or ML solutions.

RL is More Than Games

The vast majority of examples of RL you can find on the internet are based upon OpenAI’s gym. This comes with many pre-baked RL examples, but unfortunately they are focussed towards academia. There are very few examples of industrial problems (except for robotics). But these applications exist. Take a look at our companion site to see many examples of the use of RL in business problems.

RL is Not Artificial Intelligence

Artificial intelligence (AI) is an academic discipline interested in developing algorithms that produce human-like behavior. Although RL lies at the heart of much of this research, it cannot solve all problems for all organizations in an instant. It still requires thorough research and engineering.

How Does RL Help?

Reinforcement learning (RL) helps businesses solve strategic decision-making problems and is easily tied to core business metrics.

RL is often described in terms of the domain. But in our experience developing RL solutions for organizations like Nestle and CMPC, given the right situation, there are a number of generic benefits that only RL can provide:

  • Optimizes the right thing: RL algorithms are directly tied to a business metric via the reward function.
  • Uses context: The right decision now may not be the best decision in the future. RL can learn that subsequent actions may be different to those initially taken.
  • Learns how, not what: ML typically learns from discrete results; it doesn’t learn how to get there. RL learns how experts achieve a result by learning optimal strategies.
  • Strategies, not decisions: ML produces fixed decisions that do not consider future reactions. They are certainly sub-optimal given the high-level goals of the business. RL learns strategies, which encode how to achieve some future state for the organization. The resulting strategies may surprise you!
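As a rough illustration of how a reward function can tie an RL agent to a business metric, the sketch below scores each step by profit and then sums those rewards over an episode. The `revenue` and `cost` field names are hypothetical, and the discount factor `gamma` is a standard RL convention for weighting future rewards, not something taken from a specific project.

```python
def reward(step):
    """Hypothetical reward: profit for one step (revenue minus cost)."""
    return step["revenue"] - step["cost"]


def discounted_return(steps, gamma=0.9):
    """Total reward over an episode, with later rewards weighted by gamma."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future.
    for step in reversed(steps):
        total = reward(step) + gamma * total
    return total


# An early investment (a loss now) that pays off on the next step.
episode = [
    {"revenue": 0.0, "cost": 5.0},
    {"revenue": 20.0, "cost": 5.0},
]
print(discounted_return(episode, gamma=0.5))
```

An agent trained to maximize this quantity is rewarded for the outcome of the whole sequence, so a loss-making first step can still be the right choice if it enables a larger profit later.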

Reinforcement Learning for Leaders

Download your free chapter from Phil's book - Practical Reinforcement Learning

A Leader’s Perspective of Reinforcement Learning

In this introductory video, Dr. Phil Winder, CEO of Winder Research, spends three minutes introducing RL. Watch this video if you want a quick overview of how you can use RL to improve your organization’s efficiency, growth, and products.

Download Your Free Chapter - “Practical Reinforcement Learning”

We are delighted to offer you a complimentary chapter written by our company CEO and Leader, Dr. Phil Winder.

The free chapter will enable you to learn about:

  • What RL problems look like and how RL overcomes these within an organization
  • Proven RL organization implementation processes
  • Top tips for RL pre-production tooling and techniques

You can find out more about the book on the dedicated RL book website.

How do I get my copy?

Fill in the form opposite and we will send your free chapter of “Practical Reinforcement Learning” directly to your inbox. Please remember to check your spam and junk folders if nothing arrives.

What happens next?

What if you would like to learn more about RL, or about data as a whole?

As a leader, you wear many ‘hats’. Like every organization, no matter its size, yours faces daily, weekly, and perhaps longer-term challenges around sorting, cleaning, and enhancing data. We listen and share in confidence to learn more about leaders’ needs and aspirations for their team, department, and organization.

We have many insights to offer and can support you and your organization, no matter its stage in life, shape, sector, or size. We work with all of them (see our website to learn more).

Dr. Phil Winder will personally get back to you over the coming days to answer any follow-up questions you may have.

The World's Best AI Companies

From startups to the world’s largest enterprises, companies trust Winder Research.

Reinforcement Learning Consulting Services

Winder Research helps companies build production-quality reinforcement learning products and platforms.

RL Consulting

Winder Research are industrially renowned experts in reinforcement learning (RL) and we can help you with your RL problem.

Companies like Nestle work with us to provide expertise where they need it most. Our consulting guidance helps you complete your project faster and to a higher quality than it would have been otherwise. Our flexibility allows us to integrate tightly with your ways of working.

Talk to Sales

RL Development

Winder Research predominantly works on projects that involve developing RL solutions for domain specific problems.

Take CMPC, one of the world’s largest paper manufacturers, as an example. They have a complicated paper bleaching process, which had some level of ML intervention to help optimise it. However, the ML model could not take the multi-step process into account; i.e. it could not learn that changing some parameter early in the process could catastrophically alter the end result.

We ran a POC that proved that RL was capable of learning these complex multi-step interactions and they are working on incorporating it into their production process.

We can help you too, no matter what industry you are in. We operate under all contract types, from fixed cost proof-of-concepts to ongoing time and materials expertise.

Whatever your business, we can help.

Talk to Sales
Winder Research’s MLOps implementation for Grafana - Courtesy of Grafana

RL Product Development

The team at Winder Research are experienced RL practitioners and researchers.

Vendors of RL products can take advantage of our expertise to help them deliver their product. Companies like Grafana did this to create their new ML-driven monitoring capability, which required designing a bespoke integrated MLOps solution from scratch. As leaders in this space, we’ve also helped Modzy and grid.ai to build out their platforms and offerings.

Winder Research is able to deliver fully self-managed incremental product improvements. This alleviates the burden from your team and shortens development timelines. Our experts can also integrate tightly with your ways of working for a collaborative solution.

Talk to Sales

Selected Case Studies

Some of our most recent work. Find more in our portfolio.

How To Build a Robust ML Workflow With Pachyderm and Seldon

This article outlines the technical design behind the Pachyderm-Seldon Deploy integration available on GitHub and is intended to highlight the salient features of the demo. For an in-depth overview, watch the accompanying video on YouTube.

Introduction

Pachyderm and Seldon run on top of Kubernetes, a scalable orchestration system; here I explain their installation process, then I use an example use case to illustrate how to operate a release, rollback, fix, re-release cycle in a live ML deployment.

How We Built an MLOps Platform Into Grafana

Winder Research collaborated with Grafana Labs to help them build a Machine Learning (ML) capability into Grafana Cloud. A summary of this work includes:

  • Product consultancy and positioning - delivering the best product and experience
  • Design and architecture of MLOps backend - highly scalable - capable of running training jobs for thousands of customers
  • Tight integration with Grafana - low integration costs - easy product enablement

Grafana’s Need - Machine Learning Consultancy and Development

Grafana Cloud is a successful cloud-native monitoring solution developed by Grafana Labs.

Improving Data Science Strategy at Neste

Winder Research helped Neste develop their data science strategy to nudge their data scientists to produce more secure, more robust, production-ready products. The results of this work were:

  • A unified company-wide data science strategy
  • Simplified product development - “just follow the process”
  • More robust, more secure products
  • Decreased to-market time

Our Client

Neste is an energy company that focuses on renewables. The efficiency and optimization savings that machine learning, artificial intelligence and data science can provide play a key role in their strategy.