The Power of Human-in-the-Loop: Combine Human Intelligence with Machine Learning

This post is authored by Yunling Wang, Technical Program Manager in the Data Group at Microsoft.

At the Microsoft Machine Learning & Data Science Summit, we announced an exciting ‘human-in-the-loop’ product offering in partnership with CrowdFlower. CrowdFlower is a popular data enrichment and labeling platform that allows data scientist teams to use large-scale human intelligence to enrich and label their data. Through our partnership, we will bring tremendous opportunities for powering many human-in-the-loop workflows that combine human intelligence, powered by CrowdFlower, along with ML provided by Microsoft Azure Machine Learning.

In this post, we would like to provide an overview of what human-in-the-loop is, when and where it can be used, and also share our thoughts on some of the opportunities ahead in this exciting area.

What is Human-in-the-Loop?

At a high level, human-in-the-loop allows businesses to benefit from both the efficiency of ML and the quality of human judgements – machines can automate the majority of the work, while humans step in to assist when the machine is uncertain. The United States Postal Service, for instance, has been doing this for years, as evidenced by this excerpt from a New York Times article about postal workers who help the USPS decipher bad addresses that machines fail to recognize:

“At one time, there were 55 plants around the country where addresses rejected by machines were guessed at by workers aided with special software to get the mail where it was intended.”

Building these types of end-to-end, human-in-the-loop solutions can be very effective. But they are complex: they require a significant amount of time from data scientists to build ML models, on-demand human intelligence to assist when ML predictions are uncertain, and a way to connect the ML models with the humans, which is not easy.

We are happy to say that we have greatly simplified this process, combining Cortana Intelligence’s automatic machine learning capabilities with CrowdFlower’s on-demand human intelligence platform. The picture below shows a simplified view of the process, which ultimately eliminates the requirement for data scientists and the need to manage the on-demand human workforce.


The ‘Human-in-the-Loop’ Process

As we walk through this diagram, you can see that a customer starts with a task. This task gets sent to CrowdFlower AI. CrowdFlower AI passes the task to a machine to see if it can solve it. If the machine can complete the task with high confidence, the answer gets sent directly back to the customer. Otherwise, it gets sent to a human to solve. At this point, the human response not only determines the final answer but it is also used to improve the ML model, enabling the machine to get smarter over time and increasingly take on more tasks that initially had to be delegated to humans.
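The routing logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CrowdFlower AI implementation: the class name, the 0.9 confidence threshold, and the toy `toy_model` and `toy_human` functions are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class HumanInTheLoopRouter:
    """Hypothetical router: machine first, human when the machine is unsure."""
    predict: Callable[[str], Tuple[str, float]]  # returns (label, confidence)
    ask_human: Callable[[str], str]              # returns a human-provided label
    threshold: float = 0.9                       # assumed confidence cut-off
    new_training_examples: List[Tuple[str, str]] = field(default_factory=list)

    def resolve(self, task: str) -> Tuple[str, str]:
        """Return (answer, who_answered) for a single task."""
        label, confidence = self.predict(task)
        if confidence >= self.threshold:
            return label, "machine"
        # Low confidence: delegate to a human and keep the answer
        # so the model can be retrained on it later.
        human_label = self.ask_human(task)
        self.new_training_examples.append((task, human_label))
        return human_label, "human"


def toy_model(task: str) -> Tuple[str, float]:
    # Stand-in for a trained classifier: confident only about "fire".
    if "fire" in task.lower():
        return "safety", 0.95
    return "other", 0.55


def toy_human(task: str) -> str:
    # Stand-in for a crowd worker's judgement.
    return "billing"


router = HumanInTheLoopRouter(predict=toy_model, ask_human=toy_human)
print(router.resolve("Smoke and fire near the generator"))
print(router.resolve("My invoice is wrong"))
print(router.new_training_examples)
```

In this sketch the human answers accumulate in `new_training_examples`; retraining the model on them is what lets the machine take on more tasks over time.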

When is Human-in-the-Loop Useful?

There are many scenarios where businesses can find productivity improvements or cost savings through the power of human-in-the-loop. Here are a few examples:

When there are Class Imbalances, i.e. “Finding Needles in Haystacks”

Example: Detecting a forest fire from photos

There are many situations where what you are looking for is quite rare, such as detecting forest fires from photos of acres and acres of land. In such situations, ML can often know with confidence when there isn’t a fire (e.g. no colors that resemble fire or smoke), which can automatically reduce the number of photos that need to be examined manually. For a small subset of remaining cases, where machines cannot answer this question with a high level of confidence, humans can help resolve matters and, in doing so, retrain the ML model.
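As a rough sketch of this kind of triage, the snippet below clears photos that a model scores as very unlikely to show a fire and queues the rest for human review. The function name, the photo IDs, the scores, and the 0.05 cut-off are all made up for illustration.

```python
from typing import Dict, List, Tuple


def triage(fire_probabilities: Dict[str, float],
           clear_below: float = 0.05) -> Tuple[List[str], List[str]]:
    """Split photos into (auto_cleared, needs_review) by predicted fire probability."""
    auto_cleared: List[str] = []
    needs_review: List[str] = []
    for photo_id, p_fire in fire_probabilities.items():
        if p_fire < clear_below:
            auto_cleared.append(photo_id)   # machine is confident: no fire
        else:
            needs_review.append(photo_id)   # uncertain or likely fire: ask a human
    return auto_cleared, needs_review


# Hypothetical model scores for five photos.
scores = {"img_001": 0.01, "img_002": 0.02, "img_003": 0.40,
          "img_004": 0.00, "img_005": 0.97}
auto_cleared, needs_review = triage(scores)
print(f"{len(needs_review)} of {len(scores)} photos need human review")
```

The rarer the event, the larger the share of photos that falls below the cut-off, which is where the savings in manual review come from.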

When the Cost of Errors is High

Example: HOV/carpool lane violations from photos

In certain scenarios, while ML may be quite reliable, the impact of even small errors may be too costly or have other deleterious side effects. Take the example of a city that decides to automatically detect carpool lane violations in situations where there were not enough passengers in a car. To prevent false positives, it would be safer to have humans verify how many passengers were in the vehicle prior to actually issuing tickets to violators.
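One way to express this policy in code is to never let the model issue a ticket on its own: it may only dismiss confident non-violations, and everything else waits for a human decision. The sketch below assumes a hypothetical `next_action` function and a 0.95 threshold; neither comes from a real system.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NO_ACTION = "no_action"
    HUMAN_REVIEW = "human_review"
    ISSUE_TICKET = "issue_ticket"


def next_action(predicted_violation: bool,
                confidence: float,
                human_confirmed: Optional[bool] = None,
                clear_threshold: float = 0.95) -> Action:
    """Decide what to do with one photo; only a human can trigger a ticket."""
    if human_confirmed is not None:
        # A human has looked at the photo: their verdict is final.
        return Action.ISSUE_TICKET if human_confirmed else Action.NO_ACTION
    if not predicted_violation and confidence >= clear_threshold:
        # Confident "no violation": safe to dismiss automatically.
        return Action.NO_ACTION
    # Predicted violation (at any confidence) or an unsure model: verify first.
    return Action.HUMAN_REVIEW


print(next_action(predicted_violation=True, confidence=0.999))
print(next_action(predicted_violation=True, confidence=0.999, human_confirmed=True))
```

The asymmetry is deliberate: a missed violation costs little, while a wrongly issued ticket is the expensive error, so only the low-risk branch is automated.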

When Human Annotations Are Already Used

Example: Resume classification

There are several scenarios where businesses already rely on humans to tag items manually, for instance when screening resumes of candidates for interviews, or when evaluating support tickets to determine if they pertain to equipment safety. In such situations, it is worth investigating whether there could be a productivity boost or cost savings attained through a combination of ML and human-in-the-loop.

When There’s Little Data Available Today

Example: Classification of social media posts, for a new business

There are many scenarios where you will eventually want to use ML, but just do not have enough data today to get started. Take for instance the case of a new restaurant that wants to classify social media posts about itself into specific categories, such as food quality, service quality, wait times, ambience, etc. In such a scenario, humans will make much better judgements in the early stages, but, over time, machines can learn and take over the task.

When Generic Pre-Trained Models Exist, but Need to be Adapted to Custom Domains

Example: Sentiment classification, tailored to your business

There are many pre-trained ML models today which can be used for a wide variety of purposes. Take, for instance, the models available as part of our Cognitive Services, which can be used for face detection, sentiment extraction, image tagging, and so forth. In certain scenarios, these models can serve as a good starting point, and then be tailored to your business-specific needs using human-in-the-loop, such as sentiment classification specifically for your own product line.

How Does the System Work?

At this point, you may be wondering what is happening behind the scenes. How are we automatically building models that can classify accurately without a data scientist? A lot of the work data scientists do to build machine learning models is what we call feature engineering, which involves taking the data and turning it into something meaningful for the algorithms to learn from. To make this simpler, we provide feature libraries to auto-featurize the data, saving data scientists a significant amount of time on feature engineering. Auto-featurization is one of the many efforts that we are working on here in Azure Machine Learning, in order to democratize machine learning, and to simplify many of the steps that data scientists go through today.

To build the feature libraries for auto-featurization, we leverage algorithms from decades of Microsoft research in natural language processing, machine learning, computer vision, speech, big data and much more – the same algorithms that power products such as Bing, Cortana and Microsoft Office. Today we have started with a basic set of text featurizers, and we will be continually expanding the selection over time. For example, coming soon, we will be adding support for deep-neural-network-based featurizers. The power of these feature libraries is that they are usually trained on a large amount of data that is not available to most users (e.g. DNN image featurizer, trained on tens of millions of annotated images, or DSSM, trained on years of click data from Bing Ads and web search), and they save users weeks or months relative to training their own complex models. As a result, users can take advantage of these feature libraries and achieve great accuracy by learning from just a few hundred training examples.
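To make the division of labor concrete, here is a minimal sketch of the pattern: a reusable featurizer turns raw text into a vector, and a very simple classifier is then fit on a handful of labeled examples. The featurizer below is a plain hashed bag-of-words, swapped in purely for illustration; it is not one of the pre-trained Microsoft featurizers, and every function name here is hypothetical.

```python
import math
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

DIM = 64  # feature vector size, chosen arbitrarily for this sketch


def featurize(text: str) -> List[float]:
    """Stand-in featurizer: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train_centroids(examples: List[Tuple[str, str]]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label (nearest-centroid training)."""
    sums: Dict[str, List[float]] = defaultdict(lambda: [0.0] * DIM)
    counts: Dict[str, int] = defaultdict(int)
    for text, label in examples:
        for i, value in enumerate(featurize(text)):
            sums[label][i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in vec] for label, vec in sums.items()}


def classify(text: str, centroids: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the label whose centroid has the highest dot product with the text."""
    features = featurize(text)
    scores = {label: sum(a * b for a, b in zip(features, centroid))
              for label, centroid in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Tiny, made-up training set; a real one would have a few hundred examples.
centroids = train_centroids([("food was great", "food quality"),
                             ("waiter was rude", "service quality")])
print(classify("food was great", centroids))
```

The score returned by `classify` could serve as the confidence that decides whether a task is answered by the machine or sent to a human. A featurizer trained on large external data would carry far more signal than this hashed stand-in, which is why a small labeled set can be enough.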

What’s Available Today?

Our initial offering is called CrowdFlower AI Powered by Microsoft Azure Machine Learning. It is focused on text classification, enabling scenarios such as support ticket classification, social media post classification, resume classification, and much more. Customers are using this service for a variety of purposes, including:

  • Filtering large numbers of resumes down to a smaller set, to scan for potential interview candidates.
  • Identifying safety related issues in customer support tickets.
  • Classifying social media posts pertaining to their product or service.

Read more about the CrowdFlower AI offering at or from the Microsoft Analytics Partner site.

Where Do We See This Going?

We are excited about the new business opportunities enabled by CrowdFlower AI, but we are just beginning to unlock the full potential of human-in-the-loop applications. Imagine the possibilities across text processing, audio processing, image processing, and IoT signal processing, such as:

  • Traffic cameras that automatically detect HOV lane violations.
  • Fitness applications that automatically log your calorie count from pictures of the food you eat.
  • Security cameras that annotate the root cause of motion sensor triggers (e.g. whether it was an animal, human, falling leaves, a car driving by, etc.).
  • Text messaging apps that transcribe voice to text with high accuracy.

We will share more developments with you as we continue to bring more scenarios to life.