Hadoop Summit kicked off today in San Jose, and T. K. Rengarajan, Microsoft Corporate Vice President of Data Platform, delivered a keynote presentation in which he shared Microsoft’s approach to big data and the work we are doing to make Hadoop accessible in the cloud. At the event, we also announced that Azure HDInsight, our Hadoop-based service in the cloud, now supports Hadoop 2.4.
Investing in Hadoop
Hadoop is a cornerstone of our approach to making data work for everyone. As part of this bet, we have fully embraced the Hadoop ecosystem and have prioritized contributing back to the community and to Apache Hadoop-related projects such as Tez, Stinger and Hive. All told, we’ve contributed 30,000 lines of code and put in 10,000+ engineering hours to support these projects, including the porting of Hadoop to Windows. We’ve done this in partnership with Hortonworks, a relationship that ensures our Hadoop solutions are based on compatible implementations of Hadoop. One of the results of that partnership is the engineering work that has led to the Hortonworks Data Platform for Windows and Azure HDInsight.
The massive scale, power, elasticity and low cost of storage make the cloud the best place to deploy Hadoop. That’s one of the reasons we have invested heavily in our cloud-based Hadoop solution, Azure HDInsight, which combines the best of open source with the flexibility of cloud deployment. It’s also integrated with our business intelligence tools, enabling easy access and transformation of data from HDInsight to Excel and Power BI for Office 365.
Today we are providing an update to Azure HDInsight with support for Hadoop 2.4, the latest version of Hadoop. This update includes interactive querying with Hive, using advancements based on SQL Server technology that we are also contributing back to the Hadoop ecosystem through project Stinger. With this update to HDInsight, customers can use the speed and scale of the cloud to gain a 100x improvement in response time.
HDInsight is just one part of our comprehensive data platform, which includes the building blocks customers need to process data wherever it lives and in the format in which it is born. Customers can use Microsoft Intelligent Systems Service to capture machine-generated data within the Internet of Things, SQL Server or Azure SQL Database to store and retrieve data, Azure HDInsight to deploy and provision Hadoop clusters in the cloud, and Excel and Power BI for Office 365 to analyze and visualize data.