Web Services and Marketplaces Create a New Data Science Economy

This blog post is authored by Joseph Sirosh, Corporate Vice President of Machine Learning at Microsoft.

Yesterday, at Strata + Hadoop World, we announced the expansion of our data services with support for real-time analytics for Apache Hadoop in Azure HDInsight and new machine learning (ML) capabilities in the Azure Marketplace. Today, I would like to expand on the new ML capabilities that we announced and share how this is an important step in our journey to jump-start the new data science economy. I’ll also be speaking more about this in my keynote presentation tomorrow at Strata.

Data scientists and their management are often frustrated by just how little of their work makes it into production deployments. Consider this hypothetical, although not uncommon, scenario. A data scientist and his team are asked to create a new sales prediction model that can be run whenever needed. The data scientists perfect the sales model using the popular statistical modeling language R. The new model is presented to management, who want to get it up and running right away as a web app and as a mobile client. Unfortunately, engineering is unable to deploy the model because they don’t have R, and the only option is to convert it all to Java, something that will take months to get up and running. So the data scientists end up preparing a batch job to run R code and mail reports on a daily basis, leaving everyone unsatisfied.

Well, now there’s a better way, thanks to Azure Machine Learning.

We built Azure ML to empower data science with all the benefits of the cloud. Data scientists can bring R code and use Microsoft's world-class ML algorithms in our web-based ML Studio. No software installs are required for analysis or production – our browser UI works on any machine and operating system. Teams can collaborate in the cloud, share projects, experiment with world-class algorithms and include data from databases or blob storage. They can use enormous storage and compute resources in the cloud to develop the best models from their data, unrestrained by server or storage capacity.

Perhaps best of all, with just one click, users can publish a web service with their data science code embedded in it. Data transformations and models can now run in a web service in the cloud – fully managed, secure, reliable, available, and callable from anywhere in the world.

These web service APIs can be invoked from Excel, as shown in this video, by using this simple plug-in. Now, instead of emailing reports, users can surprise management with cloud-hosted apps that are built in hours. Engineering can hook up APIs to any application easily and even create custom mobile apps. Users can publish as many web services as they like, test multiple models in production and update models with new data. The data science team just became several times more productive and engineering is happy because integration is so easy.
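To give a feel for what "hooking up an API" might look like from an application, here is a minimal Python sketch of calling a published scoring service over HTTPS. The endpoint URL, the bearer-token header and the shape of the JSON payload are all assumptions made for illustration, not the documented Azure ML contract; check the API help page of your own published service for the real request format.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, rows):
    """Build (but do not send) a POST request for a hypothetical scoring endpoint.

    The {"inputs": [...]} payload shape and the Bearer header are assumptions
    for this sketch, not a documented request format.
    """
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def score(url, api_key, rows, timeout=10):
    """Send the request and decode the JSON response from the service."""
    request = build_scoring_request(url, api_key, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder URL and key; replace with the values for a real service.
    req = build_scoring_request(
        "https://example.invalid/score",
        "YOUR_API_KEY",
        [{"region": "west", "month": 11}],
    )
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from the network call means the same few lines can sit behind a web app, a mobile back end or a scheduled job, and can be checked without a live service.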

But wait, there's still more.

Imagine a data scientist hits upon that perfect idea for an intelligent web service that everyone else in the world should be building into their apps. Maybe it is a great forecasting method, or a new churn prediction technique, or a novel approach to pattern recognition. Data scientists can now build that web service in Azure ML, publish the ML web service on the Azure Marketplace and start charging for it in over one hundred currencies. Published APIs can be found via search engines. Anyone in the world can pay and subscribe to them and use them in their apps.

For the first time, data scientists can monetize their know-how and creativity just as app developers do. When this happens, we start changing the dynamics of the industry – essentially, data scientists are able to “self-publish” their domain expertise as cloud services which can then be made accessible to billions of users via smartphone apps that tap into those services.

The Azure Marketplace already has an emerging selection of such services. In just a couple of weeks, four of our data scientists published over 15 analytics APIs into the marketplace by wrapping functions from CRAN. Among others, these include APIs for forecasting, survival analysis and sentiment analysis.

Our marketplace has much more than basic analytics APIs. For example, we went and built a set of finished end-to-end ML applications, all using Azure ML, to solve specific business needs. These ML apps do not require a data scientist or ML expertise to use – the science is already baked into our solution. Users can just bring their own data and start using them. These include APIs for recommendations, for finding items that are frequently bought together, and for anomaly detection that spots anomalous events in time-series data such as server telemetry.

A similar anomaly detection API is used by Sumo Logic, a cloud-based machine data analytics company. They have collaborated with Microsoft to bring metric-based anomaly detection capability to their customers. Our metric-based anomaly detection perfectly complements Sumo Logic's structure-based anomaly detection capabilities. Any Sumo Logic query which results in a numerical time-series now has a special “metric anomaly detection” button which sends the pre-aggregated time series data to Azure ML for analysis. The data is then annotated with labels provided by the Azure ML service indicating unusual spikes or level shifts. Sumo Logic is now offering this optional integration in a limited beta release.
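For readers curious what labeling "spikes" and "level shifts" in a numerical time series can mean in practice, here is a small Python sketch. It is not the algorithm behind the Azure ML service, which this post does not describe. It is a simple illustration that uses a median-based rule, and the function name, window size, threshold and minimum scale are all assumptions chosen for the example.

```python
from statistics import median


def annotate(series, window=5, threshold=5.0, min_scale=1.0):
    """Label each point as "normal", "spike" or "level_shift".

    A point is unusual when it is far from the median of the preceding
    window. If the median of the following window is also far away, the
    series has moved to a new level; otherwise the point is a one-off spike.
    Points without a full window on both sides are left as "normal".
    """
    labels = ["normal"] * len(series)
    for i in range(window, len(series) - window + 1):
        before = series[i - window:i]
        after = series[i:i + window]
        base = median(before)
        # Median absolute deviation: a robust estimate of typical spread.
        mad = median(abs(x - base) for x in before)
        limit = threshold * max(mad, min_scale)
        if abs(series[i] - base) <= limit:
            continue  # within the usual range
        if abs(series[i - 1] - base) > limit:
            continue  # already inside an event; label only its onset
        if abs(median(after) - base) > limit:
            labels[i] = "level_shift"
        else:
            labels[i] = "spike"
    return labels


if __name__ == "__main__":
    telemetry = [10, 11, 10, 9, 10, 10, 50, 10, 11,
                 10, 9, 10, 30, 31, 30, 29, 30, 30]
    for index, label in enumerate(annotate(telemetry)):
        if label != "normal":
            print(index, telemetry[index], label)
```

In the sample data, the single reading of 50 is labeled a spike because the values around it stay near 10, while the jump to about 30 is labeled a level shift because the series stays at the new level.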

Third parties too are starting to publish APIs into our marketplace. For instance, Versium, a predictive analytics startup, has published these three sophisticated customer scores, all based on public marketing data – Giving Score (which predicts customer propensity to donate), Green Score (predicts customer propensity to make environmentally conscious purchase decisions) and Wealth Score (helps companies estimate the net worth of customers and prospects). Versium offers these scores by analyzing and associating billions of LifeData® attributes and building predictive models using Azure ML.

Our marketplace also hosts a number of other exciting APIs that use ML, including the Bing Speech Recognition Control, Microsoft Translator, Bing Synonyms API and Bing Search API.

By bringing ML capabilities to the Azure Marketplace and making it easy for anyone to access, we are liberating data science from its confines. This two-minute video recaps how:

Get going today – sign up for Azure ML and try out some of our easy-to-use samples.

A new future for machine learning is being born in the cloud. 

Joseph
Follow me on Twitter.