Jumpstarting Big Data Projects / Architectural Considerations of HDInsight Applications @ OOP 2015, TechEd Europe 2014 and PASS Summit 2014

Last week, my esteemed colleague Alexei Khalyako (from AzureCAT, the Azure Customer Advisory Team) and I spoke about jumpstarting Big Data projects at OOP 2015 (Software meets Business) in Munich. This session was also the foundation of our talks at TechEd Europe 2014 and PASS Summit 2014.

[Image: Jumpstarting Big Data Projects]

In the session, we walked through the architectural considerations and decisions involved in building an HDInsight solution. As a short reminder: HDInsight is a Hadoop implementation offered as a platform as a service (PaaS) on Microsoft Azure. The HDInsight solution was designed to drive visitor experience and provide a personalised view using recommendations.

The session is structured along the typical data warehouse workflow:

[Figure: architecture]

As we go through each step, we highlight the agony of choice between the various technologies (both open source and Microsoft Azure services), which is especially pronounced in the big data space:

[Figure: agony of choice]

We then map the technology options to each step within the data warehouse workflow:

[Figure: architecture with technology options]

And the final implementation workflow:

[Figure: implementation]

You can also find the presentation here on SlideShare.