Azure Machine Learning Packages for Vision, Text and Forecasting in Public Preview

This post is authored by Matt Conners, Principal Program Manager, and Neta Haiby, Principal Program Manager at Microsoft.

Earlier today, at Build 2018, we made a set of Azure AI Platform announcements, including the public preview release of Azure Machine Learning Packages for Computer Vision, Text Analytics, and Forecasting. The Azure Machine Learning Packages are pip-installable Python extensions for Azure Machine Learning. The packages expose high-level APIs for techniques that are otherwise complex or cumbersome to implement, and that are useful for solving data science problems in the domains of vision, text, and forecasting. These APIs boost productivity for data scientists and AI developers and help them build high-quality, accurate models by using the new algorithms built into the packages for tasks like feature generation, parameter tuning, and model selection. The Azure ML Packages shorten time to solution by abstracting away the pain points involved in model creation, deployment, and management.

Additionally, the Azure ML Packages give data scientists and AI developers the flexibility to use state-of-the-art technologies by providing interoperability with common frameworks, such as Keras, scikit-learn, TensorFlow, and CNTK.

Why Use Azure Machine Learning Packages?

There are many great reasons to use Azure Machine Learning Packages, but here are three:

  • Quality – Achieve high-quality custom machine learning and deep learning models on the first successful run (out of the box).
  • Time to Solution – Go from a dataset to a high-quality model deployed into production quickly and easily.
  • Flexibility – Support popular, state-of-the-art tools and techniques relevant for the vision, text, and forecasting domains.

AML Package for Vision

With Azure Machine Learning Package for Computer Vision, you can build, fine-tune, and deploy deep learning models for:

  • Image classification.
  • Object detection.
  • Image similarity.

What's Included in the API?

Dataset Creation

Easily connect to and load your dataset. Input data can be stored as images on your local disk, downloaded from a list of URLs, or accessed in a dataset created by querying an Image Search API. The labels associated with these images can be added or modified through Python code or through the UI that runs inside the notebook.

Data Augmentation

Augment the datasets by transforming existing images based on a variety of image transformation techniques. Any image augmentation techniques supported by the imgaug library can be used.
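
To make the idea concrete, here is a minimal, library-free sketch of augmentation by flipping. It does not use the imgaug library or the package's own API; the function names (`flip_horizontal`, `flip_vertical`, `augment`) and the nested-list image representation are assumptions chosen purely for illustration.

```python
# Illustrative only: an image is modeled as a list of pixel rows.
# These helper names are hypothetical and are not part of the package or imgaug.

def flip_horizontal(image):
    """Mirror an image left to right by reversing every row."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom by reversing the row order."""
    return [list(row) for row in reversed(image)]


def augment(image):
    """Return the original image plus its two flipped variants."""
    original = [list(row) for row in image]
    return [original, flip_horizontal(image), flip_vertical(image)]


if __name__ == "__main__":
    tiny_image = [[1, 2, 3],
                  [4, 5, 6]]
    for variant in augment(tiny_image):
        print(variant)
```

Each labeled image yields three training examples here; a real pipeline would typically also apply rotations, crops, and color changes.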

Modeling and Training

Create DNN models, train them, and score them. Users can fine-tune and customize models or use a pretrained model for a transfer learning use case. The computer vision package supports the CNTK framework for classification and image similarity and TensorFlow for object detection.

Model Evaluation

Generate metrics by evaluating trained models on datasets. The Computer Vision package includes default parameters and algorithms that best fit the scenario. These selections were tested with various datasets to provide high-quality AI on first run. You can customize and fine-tune the parameters to improve the AI quality of your models.

Deployment

Deploy a trained model to the Azure Machine Learning Operationalization service. The model can be deployed in a local Docker container or in a cluster environment using the deployment modules of the computer vision package.

AML Package for Forecasting

Forecasting is the bedrock of effective business management, and the business impact of inaccurate forecasts can be substantial. On average, companies miss their forecasts by 13%, reducing shareholder value by 6%, according to the Harvard Business Review.

The Azure Machine Learning Package for Forecasting was created by Microsoft data scientists and engineers, based on what they learned from solving hard financial and demand forecasting problems for customers, including the Microsoft finance organization. The package enables data scientists to create and deploy accurate forecast models for scenarios including:

  • Financial Forecasting: Forecast financial metrics such as revenue, sales, or operating expenses.
  • Demand Forecasting: Forecast future customer demand for goods and services to help inform production and purchasing decisions.

What's Included in the API?

Time Series Data Preparation

Load temporal data and convert it to a time series data structure that enables convenient modeling of multiple time series (e.g. by store, by product) and is compatible with the rich capabilities of Pandas.
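
As a rough illustration of what "multiple time series" means in practice, the sketch below groups flat records into one chronologically ordered series per (store, product) pair. It uses only the standard library rather than Pandas or the package's own data structure, and the function name `to_series` and the column names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical helper: not the package's time series data structure.
def to_series(records, grain=("store", "product"),
              time_col="date", value_col="sales"):
    """Group flat records into one time-ordered series per grain value."""
    series = defaultdict(list)
    for row in records:
        key = tuple(row[column] for column in grain)
        series[key].append((row[time_col], row[value_col]))
    # Sort each series by its timestamp so later steps can rely on order.
    return {key: sorted(points, key=lambda point: point[0])
            for key, points in series.items()}


if __name__ == "__main__":
    records = [
        {"store": "A", "product": "x", "date": date(2018, 1, 2), "sales": 5},
        {"store": "A", "product": "x", "date": date(2018, 1, 1), "sales": 3},
        {"store": "B", "product": "x", "date": date(2018, 1, 1), "sales": 7},
    ]
    for key, points in to_series(records).items():
        print(key, points)
```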

Exploratory Reports

Create a comprehensive report of your time series data frame, including both general data description and statistics specific to time series data to help inform what approaches you might consider for your forecasting problem.

Featurization

Automatically generate time series features from your data frame, including time index, lag features, rolling window features, and forecasts as features. You have the full flexibility to build your own features as well.
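
Two of the feature types named above, lag features and rolling window features, can be sketched in a few lines. This is a plain-Python illustration under our own assumptions, not the package's featurization API; `lag_feature` and `rolling_mean` are hypothetical names, and the rolling mean deliberately excludes the current value so that a feature never contains the quantity being predicted.

```python
# Hypothetical helpers for illustration; not the package's featurizers.

def lag_feature(values, lag):
    """Shift a series forward by `lag` steps, padding the start with None."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    n = len(values)
    return [None] * min(lag, n) + list(values[: max(n - lag, 0)])


def rolling_mean(values, window):
    """Mean of the `window` values before each position (None until available)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    features = []
    for i in range(len(values)):
        if i < window:
            features.append(None)
        else:
            features.append(sum(values[i - window:i]) / window)
    return features


if __name__ == "__main__":
    sales = [10, 20, 30, 40]
    print(lag_feature(sales, 1))
    print(rolling_mean(sales, 2))
```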

Modeling and Time Series Cross-Validation

Financial time series are characterized by relatively small amounts of data and very high accuracy requirements. In addition to modeling with univariate time series methods, the Azure Machine Learning Package for Forecasting implements time series cross-validation and hyperparameter tuning that enable you to squeeze additional accuracy from your data.
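
One common form of time series cross-validation is the rolling-origin scheme, in which the training window always precedes the test window in time. The sketch below shows that general technique; whether the package uses exactly this scheme is an assumption, and `rolling_origin_splits` and its parameters are hypothetical names.

```python
# Generic rolling-origin splitter; an assumption about the technique,
# not a reproduction of the package's cross-validation implementation.

def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_indices) pairs with train always before test."""
    if initial_train < 1 or horizon < 1 or step < 1:
        raise ValueError("initial_train, horizon, and step must be positive")
    origin = initial_train
    while origin + horizon <= n_obs:
        train = list(range(origin))
        test = list(range(origin, origin + horizon))
        yield train, test
        origin += step


if __name__ == "__main__":
    for train, test in rolling_origin_splits(n_obs=6, initial_train=3, horizon=2):
        print("train:", train, "test:", test)
```

Unlike shuffled k-fold cross-validation, no fold ever trains on observations that come after the ones it is tested on.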

Model Evaluation and Selection

Automatically calculate forecast errors, including mean absolute scaled error (MASE) and mean absolute percentage error (MAPE), and select the best model based on the backtested performance of multiple forecast models.
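
For readers unfamiliar with these metrics, the sketch below computes them from their standard definitions and picks the candidate with the lowest backtest error. It is not the package's evaluation code; the function names and the choice to rank models by MASE are assumptions for illustration.

```python
# Standard metric definitions; helper names are hypothetical.

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def mase(actual, forecast, train, season=1):
    """Forecast MAE scaled by the in-sample MAE of a naive forecast on `train`."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(train[i] - train[i - season])
                    for i in range(season, len(train))]
    return mae / (sum(naive_errors) / len(naive_errors))


def select_best(candidates, actual, train):
    """Return the name of the candidate forecast with the lowest MASE."""
    return min(candidates, key=lambda name: mase(actual, candidates[name], train))


if __name__ == "__main__":
    train = [10, 12, 14, 16]
    actual = [18, 20]
    candidates = {"naive": [16, 16], "trend": [17, 22]}
    for name, forecast in candidates.items():
        print(name, mape(actual, forecast), mase(actual, forecast, train))
    print("best:", select_best(candidates, actual, train))
```

A MASE below 1 means the model beat the naive "repeat the last value" forecast on average.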

Model Deployment

Operationalize your forecast models as web services at the enterprise scale offered by Azure Machine Learning.

Azure Machine Learning Package for Text Analytics

The Azure Machine Learning Text Analytics Package is a Python package that simplifies the experience of building and deploying high quality machine learning and deep learning text analytics models in Azure Machine Learning.

The Azure Machine Learning Text Analytics Package supports the following scenarios:

Text Classification

  • Binary, multi-class, or multi-label.
  • Scikit-learn traditional machine learning algorithms.
  • Keras/TensorFlow convolutional or recurrent neural networks.

Custom Entity Extraction

  • Keras/TensorFlow recurrent neural networks.
  • Conditional Random Fields (CRF) models.

Word Embedding

  • Word2Vec word embedding model training.
  • FastText word embedding model training.

What's Included in the API?

Feature Engineering

  • Word n-grams.
  • Character n-grams.
  • Word2Vec embedding features.
  • FastText embedding features.
  • Dictionary-based features.
  • Pre-trained model prediction as features.
  • Part-of-speech tags.
  • Orthographic (capitalization) features.
  • Embedding clustering.
  • And more.
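
The first two items in the list above, word n-grams and character n-grams, are simple to sketch. The helpers below are a minimal illustration of the general technique, not the package's feature extractors; the names `word_ngrams` and `char_ngrams` are assumptions.

```python
# Hypothetical helpers illustrating n-gram extraction in general.

def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return every run of n consecutive characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    print(word_ngrams(["azure", "machine", "learning"], 2))
    print(char_ngrams("text", 3))
```

Character n-grams are useful when spelling varies or words are rare, since related words still share many substrings.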

Text Preprocessing

  • Tokenization.
  • Case normalization.
  • Lemmatization.
  • Stemming.
  • Stopword removal.
  • Numbers removal.
  • Special character removal.
  • Dictionary-based normalization.
  • And more.
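
Several of the steps listed above can be chained into a small pipeline. The sketch below covers case normalization, number removal, special character removal, whitespace tokenization, and stopword removal using only the standard library. It is an illustration under our own assumptions rather than the package's preprocessing API: the `preprocess` name and the tiny stopword list are hypothetical, and the character filter assumes English ASCII text.

```python
import re

# A deliberately tiny, hypothetical stopword list for illustration.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in"})


def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, strip digits and special characters, tokenize, drop stopwords."""
    text = text.lower()                    # case normalization
    text = re.sub(r"\d+", " ", text)       # number removal
    text = re.sub(r"[^a-z\s]", " ", text)  # special character removal (ASCII only)
    tokens = text.split()                  # whitespace tokenization
    return [token for token in tokens if token not in stopwords]


if __name__ == "__main__":
    print(preprocess("The 3 models of Azure ML, deployed in 2018!"))
```

Lemmatization and stemming are omitted here because they need language resources beyond the standard library.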

Get the Azure Machine Learning Packages

Use the links below to learn more about Azure Machine Learning Packages today:

Matt & Neta