9 Things You Should Do To Optimize HBase Performance in HDInsight

Reposted from Channel 9 and the Azure HDInsight blog.

The Cortana Intelligence Suite is designed to help businesses turn their big data into intelligent insights and actions. An important component of the Suite is Azure HDInsight, an Apache Hadoop distribution powered by the cloud. HDInsight scales on demand from terabytes to petabytes of data. It lets you spin up any number of nodes at any time, and we charge only for the compute and storage that you use. In short, HDInsight is Microsoft’s cloud offering of Apache Hadoop, Spark, R, HBase, and Storm, made easy to use.

Apache HBase is an open source NoSQL big data store that’s built on Hadoop and modeled after Google Bigtable. HBase provides random access and strong consistency for large amounts of unstructured and semi-structured data in a schema-less database organized by column families. HBase also offers many tuning options for getting great performance in HDInsight.
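To make "schema-less" and "organized by column families" a little more concrete, here is a minimal in-memory sketch in Python. It is only a toy model, not the HBase client API: the `ToyTable` class, the row keys, and the `info` and `metrics` families are all hypothetical names chosen for illustration. Real HBase also stores values as bytes, keeps rows sorted by key, and versions each cell by timestamp, none of which is modeled here.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase-style table (illustration only)."""

    def __init__(self, column_families):
        # Column families are declared up front, when the table is created.
        self.column_families = set(column_families)
        # Layout: row key -> column family -> column qualifier -> value
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value):
        # Qualifiers are free-form; only the family must already exist.
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][qualifier] = value

    def get(self, row_key, family, qualifier):
        # Random access: fetch one cell directly by its row key.
        # Chained .get calls avoid creating empty rows on a miss.
        return self.rows.get(row_key, {}).get(family, {}).get(qualifier)


table = ToyTable(["info", "metrics"])
table.put("user#1001", "info", "name", "Ada")
table.put("user#1001", "metrics", "logins", 42)
# A second row can use different qualifiers without any schema change.
table.put("user#1002", "info", "email", "grace@example.com")

print(table.get("user#1001", "info", "name"))
```

The point of the sketch is the shape of the data: each row can carry a different set of columns, while the small, fixed set of column families groups related columns together.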

In the Channel 9 video below, we discuss 9 things you can do to get great performance from your HBase HDInsight cluster.

A summary of these recommendations is also available from this earlier blog post on the same topic.

CIML Blog Team