Cloud-Scale Text Classification with Convolutional Neural Networks on Microsoft Azure

This post is by Miguel Fierro, Ilia Karmanov, Thomas Delteil, Andreas Argyriou, and Max Kaznady, all Data Scientists at Microsoft. Natural Language Processing (NLP) is one of the fields in which deep learning has made significant progress. Specifically, the area of text classification, where the objective is to categorize documents, paragraphs or individual sentences into…

Julia – A Fresh Approach to Numerical Computing

This post is authored by Viral B. Shah, co-creator of the Julia language and co-founder and CEO at Julia Computing, and Avik Sengupta, head of engineering at Julia Computing. The Julia language provides a fresh new approach to numerical computing, where there is no longer a compromise between performance and productivity. A high-level language that…

New Year & New Updates to the Windows Data Science Virtual Machine

This post is authored by Gopi Kumar, Principal Program Manager in the Data Group at Microsoft. First of all, a big thank you to all users of the Data Science Virtual Machine (DSVM) for your tremendous response to our offering in 2016. We’re looking forward to a similarly great year in 2017. The new year…

Hello 2017, and Recap of Top 10 Posts of 2016

As we kick off what will surely be another very exciting year of progress in artificial intelligence, machine learning and data science, we start with a quick recap of our “Top 10” most popular posts (based on aggregate readership) from the year just concluded. Here are the posts that had the most page views in…

Exploring Azure Data with Apache Drill, Now Pre-Installed on the Microsoft Data Science Virtual Machine

This post is authored by Gopi Kumar, Principal Program Manager in Microsoft’s Data Group. We recently came across Apache Drill, a very interesting data analytics tool. The introduction page to Drill describes it well: “Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support…

Using SQL Server 2016 with R Services for Campaign Optimization

This post is authored by Nagesh Pabbisetty, Partner Director of Program Management at Microsoft. We are happy to announce a new Campaign Optimization solution based on R Services in SQL Server 2016, designed to help customers apply machine learning to increase response rates from their leads. This post contains more information about this new solution…

New Additions to the Data Science Virtual Machine – Test Drive, Community Forums, Deep Learning

This post is authored by Paul Shealy, Senior Software Engineer, and Barnam Bora, Program Manager, at Microsoft. The Data Science Virtual Machine (DSVM) is a custom virtual machine image from Microsoft that comes pre-installed with popular data science tools for modeling and development activities. The DSVM is offered in both Windows and Linux editions. There’s…

Introducing the Team Data Science Process from Microsoft

This post is by Jacob Spoelstra, Data Science Director, Hang Zhang, Senior Data Scientist Manager, and Gopi Kumar, Principal Program Manager in the Data Science Team at Microsoft. Are you building a data science team but unsure how to make the team productive? Are you concerned that the lack of collaboration or consistent processes could…

Recent Updates to the Microsoft Data Science Virtual Machine

Posted by Gopi Kumar, Principal Program Manager in the Microsoft Data Group. It’s been over 9 months since we first released the Data Science Virtual Machine (DSVM), a custom virtual machine image we published in the Azure Marketplace with a host of popular data science tools pre-installed and pre-configured. We’ve made a few updates since…

Using Microsoft R Server on a Single Machine for Experiments With 600M Taxi Rides

Re-posted from R-bloggers. The New York City taxi dataset is one of the largest publicly available datasets, with information about 1.1 billion NYC taxi rides. This dataset has been explored and visualized in a number of blog posts, using a variety of techniques and technologies (e.g., PostgreSQL, Elasticsearch). A recent blog post showed…