Predictions for data in 2016

As the expert in our team on Machine Learning, which is all about making predictions from data, Harry has asked me to make some predictions for 2016. For this article, however, I’ll be relying on my experience and not an API built from a neural network!

Prediction 1: Machine learning is a good place to start

Arthur C Clarke once remarked that any sufficiently advanced technology is indistinguishable from magic, but I would suggest that in 2016 some of this technology will disappear from view simply by becoming ordinary and commonplace.

Let me give you an example – in 2001: A Space Odyssey, the onboard computer HAL recognises Dr Bowman and says hello. It has taken a little longer, but Windows 10 now has Hello, which will log you in by recognising your face. The delay was not because recognising a face from pictures is hard, but because Hello works from a 3D camera, so Windows knows your face by the depth of its features and won’t accept a picture of you.

So what if you want to embed this sort of intelligence in your own applications?

Prediction 2: 2016 will be the year of R

R is the universal language for machine learning across all platforms, and it will be coming to you in SQL Server 2016, in Power BI, and in Visual Studio at some point. R is not only set-based like SQL, but also has very rich and extensible functions for statistics, and it can plot results:

Some data about Bill Gates…

image

A tiny bit of R…

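library(grid)
# Read the pixel data from the first input port (Azure ML Execute R Script module)
img <- maml.mapInputPort(1)
# Convert each pixel's R, G and B values (0-255) into a colour string
img_rgb <- rgb(img$R, img$G, img$B, maxColorValue = 255)
# Reshape the colour vector into a 160 x 160 matrix
dim(img_rgb) <- c(160, 160)
# Draw the matrix as a raster image
grid.raster(img_rgb)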

And here’s Bill as a 160 x 160 plot of that data:

image
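
The snippet above only runs inside Azure Machine Learning, because that is where maml.mapInputPort lives. If you want to try the same idea in an ordinary R session, here is a rough sketch. Note that the data frame is something I have made up for illustration (a synthetic colour gradient with random blue values, not the real pixel data), and the variable names n and channel_means are my own. It also shows the set-based side of R: one call summarises every column, and rgb() converts every pixel in a single step.

```r
library(grid)

# Hypothetical stand-in for the Azure ML input: one row per pixel,
# with R, G and B channel values between 0 and 255.
n <- 160
set.seed(1)
img <- data.frame(
  R = rep(seq(0, 255, length.out = n), times = n),
  G = rep(seq(0, 255, length.out = n), each = n),
  B = sample(0:255, n * n, replace = TRUE)
)

# Set-based: one call gives the mean of every channel, no explicit loop.
channel_means <- colMeans(img)
print(channel_means)

# rgb() is vectorised, so all n * n pixels become colour strings at once.
img_rgb <- rgb(img$R, img$G, img$B, maxColorValue = 255)
dim(img_rgb) <- c(n, n)

# Draw the colour matrix as a raster image.
grid.newpage()
grid.raster(img_rgb)
```

Swap the made-up data frame for your own pixel data (one row per pixel, with R, G and B columns) and the rest should carry over unchanged, as long as the row count matches the dimensions you set.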

Prediction 3: The importance of APIs

APIs will become ever more important as a way of stitching services together without exposing the code and data behind an application. This is a good thing because it lets us limit what data we share (subject to the APIs being secure, of course).

Prediction 4: Less of a prediction, more of a hope

I hope U-SQL will take off. In a world of many computer languages and technologies, particularly around big data, you might wonder how another one can make a difference, but actually that diversity indicates that there is no killer language for big data. So U-SQL is Microsoft’s attempt at getting past the problems of architecting and executing processes to run against big data by doing nothing more than combining C# and SQL. C# is there to describe the data, which is then analysed using familiar SQL.

Prediction 5: More integration and better tooling in the Cortana Analytics Suite

Following on from U-SQL, I think we’ll also see more integration and better tooling in the Cortana Analytics Suite. If you haven’t read up on this, it’s a collection of data-related online services in Azure and Office 365 (Power BI). Many of these, however, were developed independently, and so what we will see in 2016 is the integration and extension of these services. Actually this is part of a continuing programme: for example, Stream Analytics can now call Machine Learning inline, and Azure Data Factory (ADF) is already aware of Azure Data Lake (ADL).

Next year, however, we’ll see even more improvements to this and to the tooling to make it easier and more agile. For example, ADF is still very code heavy, so everything needs to be hand-crafted in JSON. While we now have nice ADF templates and solutions in Visual Studio, it has not got the simple-to-use drag-and-drop UI that we have in SQL Server Integration Services or Informatica. I understand that this will change, but I don’t know when or to what. We’ll also see the launch of SQL Server 2016, which is another example of integration, both with R and with Hadoop via the PolyBase technologies.

Oh, and another example that just landed in my inbox: a dedicated VM in the Azure gallery for Data Science.

Prediction 6: Human learning

Cortana is maturing to the point where there will be Microsoft certifications in this space, but this brings challenges, as the learning materials and the exams need to keep pace with new developments. What I can say for sure is that I am trying out the beta exams in January and will keep you posted.

Prediction 7: SQLBits

If you are serious about your data-driven career, I’ll be seeing you at SQLBits on 4th-7th May in Liverpool. And if not, at a SQL Relay, Saturday evening or something in 2016.