End-to-End Data Science Walkthrough with Spark 2.0 on Azure HDInsight Hadoop Clusters

This post is authored by Debraj GuhaThakurta, Senior Data Scientist, and Brad Severtson, Senior Content Developer, at Microsoft. The data scientists among you will have seen how Spark 2.0, which was released in July 2016, offered several enhancements over Spark 1.6. These enhancements included: Easier ANSI SQL and more streamlined APIs. Improvements in the speeds of… Read more

Moving eBird to the Azure Cloud

Re-posted from the Azure Data Lake & HDInsight blog. Hosted by the Cornell Lab of Ornithology, eBird is a citizen science project that allows birders to submit observations to a central database. Birders seek to identify and record the birds that they discover, and can also report how much effort it took to find those… Read more

Introducing Microsoft R Server 9.0

This post is authored by Nagesh Pabbisetty, Partner Director of Program Management at Microsoft. To thrive in today’s data-driven world, businesses increasingly need more powerful analytics solutions to predict customer behavior and discover new opportunities. However, existing solutions often fail to deliver enough insights, fast enough. At Microsoft, we continue to invest deeply in advanced… Read more

Free Online Workshop on Cortana Intelligence Suite: Register Now!

Get Live, Step-by-Step Guidance from Microsoft Experts. This post is authored by Matthew Calder, Senior Content Developer at Microsoft. Join us on Microsoft Virtual Academy on Tuesday, December 6th, 2016, from 9AM – 4PM Pacific, for an exciting look at the Cortana Intelligence Suite (CIS), and end your day with a fully working intelligent web… Read more

Microsoft R Server for HDInsight Now Generally Available

This post is by Nagesh Pabbisetty, Partner Director of Program Management. Since we released Microsoft R Server on HDInsight in preview in March 2016, customers have used it for predictive modeling, machine learning, and statistical analysis. With the general availability of R Server for HDInsight, customers will be able to leverage the largest portable R-compatible… Read more

The Next Generation Database & Data Lake from Microsoft

Re-posted from the SQL Server blog. Earlier today, at the Connect() event, which is livestreaming globally from New York City, we announced the next generation of Microsoft SQL Server and Azure Data Lake, as well as many other exciting new capabilities to help developers build intelligent applications. Here’s a quick recap of the key announcements:… Read more

Reddit ‘Ask Me Anything’ Session on Big Data & Analytics

We’re excited to host a special Reddit Ask Me Anything (AMA) session next week, focused on big data & analytics. Join us in an interactive conversation with Microsoft engineers who are pushing the state-of-the-art in this space. The session will take place on /r/Azure on Thursday next week, November 17th, between 10AM and 2PM Pacific… Read more

Data Manipulation at Scale with Microsoft R Server & Spark on Azure HDInsight

Re-posted from the Revolutions blog. Dealing with distributed data and having to program concurrent systems is not always the easiest of tasks, and data scientists familiar with R are unlikely to have extensive experience with such systems. In such scenarios, Spark offers a very popular, intuitive distributed data processing platform, with R and Python APIs… Read more

9 Things You Should Do To Optimize HBase Performance in HDInsight

Re-posted from Channel 9 and the Azure HDInsight blog. The Cortana Intelligence Suite is designed to help businesses turn their big data into intelligent insights and actions. An important component of the Suite is Azure HDInsight, an Apache Hadoop distribution powered by the cloud. HDInsight handles any amount of data, scaling from terabytes to petabytes on demand… Read more

Applying Deep Learning at Cloud Scale, with Microsoft R Server & Azure Data Lake

This post is by Max Kaznady, Data Scientist, Miguel Fierro, Data Scientist, Richin Jain, Solution Architect, T. J. Hazen, Principal Data Scientist Manager, and Tao Wu, Principal Data Scientist Manager, all at Microsoft. Today’s businesses collect vast volumes of images, video, text and other types of data – data which can provide tremendous business value… Read more