Architecture for Data Exploration using Microsoft Azure

I have decided to write this blog post as quite a few of my clients are asking me how best to enable a data culture inside their organizations. Satya Nadella says: “this is where I think we as leaders of our businesses get to do the most transformative thing, which is to first make sure that everyone inside of the organization has the tools, has the capability to be able to gain these insights and then we empower them to act on those insights. This data culture is very much the journey that Microsoft itself is on.”

To enable a data culture, everyone must be able to access the data they need when they need it. For this, we need a BI architecture built for data exploration.

The diagram above contrasts two approaches: traditional BI, which is deductive, and the newer trend of exploratory BI, which is inductive. The usual approach for a data warehousing project is to start from company strategy and then derive business requirements, such as those produced by the balanced scorecard methodology. These requirements are then broken down into technical requirements, such as KPIs, that are implemented using ETL, DW, and OLAP technologies. This top-down, deductive method works very well for descriptive and diagnostic analyses.

In our fast-paced world, where agility is key, there is a new way to do BI through inductive, bottom-up methods. This lets us move up the value chain to predictive and prescriptive analytics, thus maximizing the value of an organization’s informational assets. The inductive approach starts from the unprocessed sources of information, whether or not they are co-located in a data lake, and lets business users observe and play with the data. With the right tools, they can then detect patterns, form hypotheses and confirm them.
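To make the observe, hypothesize, confirm loop a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records, the field names (`discount_pct`, `units_sold`), the 0.8 threshold and the helper `pearson` are hypothetical and do not come from any particular Azure service. The idea is simply to look for a pattern in one slice of raw data and then check whether it still holds on a slice that was set aside.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical raw records, as a business user might pull them from a data lake.
explore = [
    {"discount_pct": 0, "units_sold": 10},
    {"discount_pct": 5, "units_sold": 12},
    {"discount_pct": 10, "units_sold": 15},
    {"discount_pct": 15, "units_sold": 19},
    {"discount_pct": 20, "units_sold": 22},
]
# Records set aside so the hypothesis is not confirmed on the data that suggested it.
holdout = [
    {"discount_pct": 0, "units_sold": 11},
    {"discount_pct": 10, "units_sold": 14},
    {"discount_pct": 20, "units_sold": 23},
    {"discount_pct": 30, "units_sold": 27},
]

THRESHOLD = 0.8  # assumed cut-off for calling a correlation "strong"


def correlation(records):
    """Correlation between discount and units sold for a list of records."""
    xs = [r["discount_pct"] for r in records]
    ys = [r["units_sold"] for r in records]
    return pearson(xs, ys)


# Step 1: observe the raw data and detect a pattern.
pattern_found = correlation(explore) > THRESHOLD

# Step 2: state the hypothesis suggested by the pattern.
hypothesis = "higher discounts go with higher unit sales"

# Step 3: confirm (or reject) the hypothesis on data not used in step 1.
confirmed = pattern_found and correlation(holdout) > THRESHOLD

print(f"Pattern found in exploration data: {pattern_found}")
print(f"Hypothesis '{hypothesis}' confirmed on holdout data: {confirmed}")
```

A correlation is only a starting point, of course. In practice the confirmation step would use a proper statistical test or a controlled experiment, but the shape of the loop is the same.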

Far from being an either/or choice, deductive BI and inductive BI complement each other and both form part of a modern data warehouse strategy. Both sets of technologies enable scenarios that are valuable for an organization, and its top-down, authoritative metrics are not threatened by the flexibility and agility that an inductive approach allows. At Microsoft, we believe that most organizations will benefit from both types of analytics.

By combining Cortana Analytics Suite components, namely Azure Data Catalog, Azure Data Lake Store and HDInsight, we can assemble a solution that enables inductive data exploration. The resulting architecture is shown in the picture below.