This blog post is authored by Joseph Sirosh, Corporate Vice President of Information Management & Machine Learning at Microsoft.
We built Azure Machine Learning to democratize machine learning. We wanted to eliminate the heavy lifting involved in building and deploying machine learning technology and make it accessible to everybody. Supporting open source innovation and enabling breakthrough learning capabilities with big data were important. So were supporting community-driven development and the ability for developers to easily create and monetize cloud-hosted APIs and applications. Most importantly, we wanted our customers to easily leverage future advancements in data science.
And now that future is taking shape. Today, at Strata + Hadoop World, we are announcing the general availability release of Azure Machine Learning, a fully-managed, fully-supported service in the cloud. No software to download, no servers to manage – all you need to start doing data science is a browser and internet connectivity. This release is packed with game-changing innovations – here are a few highlights:
Creating web services is now a lot easier. We completely revamped the process for creating web services. It is now far more intuitive to take a data science workflow and create an analytics web service from it – it takes only minutes. We even provide a ready-to-use Excel client into which you can plug in your own data to easily test your web service.
You can train/retrain through APIs. Also new in this release is programmatic access to refresh Azure Machine Learning models with new data. This capability lets you retrain a model periodically, for instance when new data becomes available. It also allows consumers of a model that you created to retrain the model with their own data. For example, now you can create an API in the marketplace for your customer, and provide support for the customer to update the model with fresh data.
Python is now supported, and so is R. Now you can use the Anaconda distribution of Python along with its rich ecosystem of libraries such as NumPy, SciPy, pandas and scikit-learn directly in Azure Machine Learning Studio. Python developers can easily build sophisticated analytics experiments and create web services in the cloud with a few clicks. You can do the same with R, and even build experiments and web services that compose Python, R and Microsoft’s machine learning algorithms in a single workflow. This is a boon for innovators who seek to leverage the rich open source libraries of these two ecosystems in application building.
“Big Learning” is now possible. Azure Machine Learning now supports Learning with Counts, a revolutionary feature transformation capability that allows efficient classification and regression with terabyte-sized data sets. This new capability uses parallel MapReduce in Azure HDInsight to efficiently create reduced feature representations from big data. Using the transformed features and appropriate sampling, one can learn highly accurate predictive models using state-of-the-art algorithms such as neural networks and boosted decision trees.
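To make the idea of a count-based feature representation concrete, here is a minimal single-machine sketch in Python. It is an illustration of the general technique only, not the Learning with Counts module itself and not the parallel MapReduce job that runs in HDInsight. The function names, the sample data and the smoothing constant are all assumptions made for this example.

```python
import math
from collections import defaultdict

def build_count_table(rows, feature, label):
    """Count how often each feature value co-occurs with each binary label."""
    counts = defaultdict(lambda: [0, 0])  # value -> [negatives, positives]
    for row in rows:
        counts[row[feature]][int(row[label])] += 1
    return dict(counts)

def count_features(value, table, prior=1.0):
    """Map a raw categorical value to (negatives, positives, smoothed log-odds).

    `prior` is an illustrative smoothing constant so that unseen or rare
    values do not produce a division by zero or an infinite log-odds.
    """
    neg, pos = table.get(value, (0, 0))
    log_odds = math.log((pos + prior) / (neg + prior))
    return neg, pos, log_odds

# Hypothetical click log: a high-cardinality column ("site") and a binary label.
clicks = [
    {"site": "a.example", "clicked": 1},
    {"site": "a.example", "clicked": 1},
    {"site": "a.example", "clicked": 0},
    {"site": "b.example", "clicked": 0},
]

table = build_count_table(clicks, "site", "clicked")
features = [count_features(row["site"], table) for row in clicks]
```

Each raw value, however many distinct values the column has, is replaced by a handful of numbers. That compact representation is what makes it practical to train algorithms such as neural networks or boosted decision trees on a sample of the transformed data.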
You can use finished web services on the Azure Store. We now have a set of web service applications available on the Azure Store for common machine learning applications. These include Recommendations, Anomaly Detection and Text Analytics. Any web site, phone app, or SaaS application can integrate these capabilities with a few lines of code. These are examples of powerful applications that data scientists can now create and publish to the Azure Machine Learning marketplace, and participate in the emerging data science economy.
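As a rough idea of what "a few lines of code" can look like, the Python sketch below builds an authenticated JSON POST request using only the standard library. Everything service-specific here is a placeholder: the endpoint URL, the API key, the bearer-token header and the shape of the JSON payload are assumptions for illustration, so check the documentation of the specific service you subscribe to for the real values.

```python
import json
import urllib.request

# Hypothetical values: substitute the endpoint and key issued for your own service.
ENDPOINT = "https://example.invalid/score"
API_KEY = "YOUR-API-KEY"

def build_scoring_request(text, endpoint=ENDPOINT, api_key=API_KEY):
    """Build (but do not send) a JSON POST request for a scoring web service."""
    body = json.dumps({"text": text}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )

request = build_scoring_request("The service was quick and friendly.")

# To actually call a real service, send the request and parse the reply:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

The request is built separately from the network call so the same helper can be reused from a web site back end, a mobile app service layer, or a batch job.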
We added a new community gallery. This release includes a community-driven gallery that lets you discover and use interesting experiments authored by others. You can ask questions or post comments about experiments in the gallery or publish your own. You can share links to interesting experiments via social channels such as LinkedIn and Twitter. The gallery is a great way for users to get started with Azure Machine Learning and learn from others in the community.
But that’s not all. To ease the path for cloud-based data science, we have created a step-by-step guide for the data science journey from raw data to a consumable web service. We also added the ability to use great tools such as IPython Notebook and Python Tools for Visual Studio along with Azure Machine Learning. And there are new capabilities for data reading and transformation, a module for SQLite support, and new learning algorithms such as Quantile Regression. With the integration of these diverse capabilities, Azure Machine Learning is now the most comprehensive data science and machine learning service available.
Our customers continue to apply Azure Machine Learning in interesting business scenarios. For example, eSmart Systems of Norway is pioneering smart grid management using our tools. A traditional smart grid includes multiple data silos, including SCADA networks, building automation systems and substation meters. In this environment, it can be difficult to forecast consumption and prevent bottlenecks or outages. For a utility company, upgrading its entire infrastructure would be costly. Even when upgrades are made, e.g. new smart sensors or meters, data gets collected but is not readily accessible. eSmart Systems uses the Azure cloud platform to integrate and analyze usage data and create forecasts. Azure Machine Learning is the "brains" of the solution, running the data models for predictive analytics. The analytics are used to predict capacity problems and automatically control load in individual buildings.
Sigurd Seteklev, Chief Strategy Officer of eSmart Systems, says:
“For what we’re doing at eSmart, we needed a cloud solution because of the sheer volume of data being collected; if we were to do it on premise we’d need a lot of storage. We also do a lot of data crunching using Hadoop, which also requires a lot of infrastructure. What we really like about Azure Machine Learning, and Azure in general, is that everything we do is through services available in Azure and we don’t need to monitor virtual machines.”
Mendeley is another innovative customer. One of the biggest repositories of scientific research content in the world, Mendeley provides a global platform and social network to foster discovery and community collaboration. To improve the user experience, Mendeley was looking to anticipate the behavior of new users in their initial adoption and engagement phase. Within two weeks of implementing Azure Machine Learning, developers were able to create a predictive model that was 30 percent more accurate than an earlier model that had taken them months to develop on their own. Not only is Mendeley able to iterate and deploy models three to five times faster, it can also pinpoint its users’ needs with much greater confidence.
“Azure Machine Learning allowed us to build a better model than our previous solution in a third of the time, reducing lead time from model evaluation to deployment down to zero as it’s automated,” says Mendeley CTO Fernando Fanton. “The beauty of Azure Machine Learning is that it’s open, allowing easy integration via widely adopted technologies such as REST and Hive.”
Hundreds of Microsoft partners including Booz Allen Hamilton, Cognizant Technology Solutions, Dell and Infosys are using Azure Machine Learning to build innovative advanced analytics solutions for customers. Several of them offer Azure Machine Learning-based learning services to global data science communities. We will share more information about our partner organizations’ adoption of Azure Machine Learning in an upcoming blog post.
Also, as we announced earlier today, Informatica has joined our ecosystem of partners. The Informatica Cloud service allows customers to pull data from a variety of on-premises systems and the cloud – including from SaaS applications such as Salesforce.com, Workday, Marketo and more – into Azure Blob storage. Once the data is in Azure, it is readily accessible for processing and analytics using Azure Machine Learning. Learn more about Informatica Cloud and the Informatica Azure Blob connector here.
For those of you who have not yet experienced Azure Machine Learning first-hand, I encourage you to check out our offering today – it is free and easy for new users to get started, and no credit cards or Azure subscriptions are needed.
We believe Azure Machine Learning is a game changer. No other advanced analytics service comes close to the scope, openness and breadth of the offering, or the ability to leverage the cloud for easy application development and deployment. Together with other Azure big data services such as HDInsight, stream analytics offerings such as Azure Stream Analytics, data pipeline orchestration services such as Azure Data Factory, and business intelligence services such as Power BI, Azure Machine Learning enables businesses to wring value out of every byte of data that they store and process. The future is bright for a world optimized with data, insights and intelligence.
Follow me on Twitter
PS: If you are attending Strata this week, tune in to my talk on Cloud Machine Learning to learn about how businesses benefit when advanced analytics and the cloud come together.