Microsoft Big Data Hackathon Resources


Data sets

ThinkData Works Data sets: http://namara.io/#/
The site consolidates data from open.data.ca, Statistics Canada, provincial data sources, GoodLife Fitness, SpotCrime and others. Here are links to a few datasets of interest:

Demographics

·       Ward profiles: http://namara.io/#/display/8f403324-dcd4-4f3a-9838-f14171b1337a

·       Wellbeing Toronto: http://namara.io/#/display/d7bc1cda-f93c-4886-92d2-431962cb18f6

·       Labour force survey (LFS) estimates, by sex and age group: http://namara.io/#/display/461720d7-91a4-4ee7-a325-e2c56025d4bc

Transportation

·       Road restrictions: http://namara.io/#/display/261e371e-2f7c-4bd2-8809-9f37a83c7fce

·       Traffic and pedestrian volume: http://namara.io/#/display/a0ccf527-4021-4993-a485-d45da787b89d

·       TTC stop times: http://namara.io/#/display/9139628b-6d32-497d-8009-b9ae0e89c52f

·       TTC Trips: http://namara.io/#/display/25d4d4fe-0a2a-478a-beed-729a94450990

·       TTC Stops: http://namara.io/#/display/bbf1ed02-03d0-4e4f-9188-190b05f80d5c

Social and community housing

·       Drop in locations: http://namara.io/#/display/4b379976-1e8a-4e5d-9208-c2d567f867e8

·       Homeless shelters: http://namara.io/#/display/af527fa3-e936-4d46-b9c1-e6edc80e6f83

Development

·       Active building permits: http://namara.io/#/display/31591bd0-9ce5-4f13-87aa-497f30fad217

Cultural / POI

·       Cultural spaces: http://namara.io/#/display/a95c244c-9f11-499f-91c0-e8356ff6db0f

·       Schools: http://namara.io/#/display/a2b31ef0-1840-4292-ab95-68fc2392a724

·       Child care centres: http://namara.io/#/display/1a9d8825-fda8-4870-9a6b-fb2521628267 

·       Early years centres: http://namara.io/#/display/a4a02423-68b6-4f07-a69e-96540194f9b1

·       Parks: http://namara.io/#/display/735b7e24-0fdb-4918-b112-5958725410a7

·       Places of interest: http://namara.io/#/display/389ea13c-ecf6-4f5c-9348-b9bc5d5f24bf

·       Businesses: http://namara.io/#/display/d101e178-204b-4fd5-86b3-30bb8083d79f

Other

·       Canadian Disaster Database: http://namara.io/#/display/9b024c88-bdd7-4bc4-929d-9f6f29952695

 
Canadian Government Open Data Portal: http://open.canada.ca/en 
Finance, Economics and Society data: https://www.quandl.com/
You can also use Power Query to retrieve data from Facebook; please read the article about it here
 
Example:
US/Canada Border Wait Times are available here
The data set is not large (around 1M records) and is not very interesting in itself, since analysis is pretty much limited to location and time. Combined with other widely available data sets, however, it could be the basis for relatively interesting exploratory and predictive analysis.
You could integrate and correlate it with:

·         Weather data from nearby weather stations: http://climate.weather.gc.ca/

·         Canadian dollar exchange rates: http://www.canadianforex.ca/forex-tools/historical-rate-tools/historical-exchange-rates

·         Fuel prices: http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=3260009&paSer=&pattern=&stByVal=1&p1=1&p2=31&tabMode=dataTable&csid and http://www.energy.gov.on.ca/en/fuel-prices/

·         Terror alert levels: http://www.dhs.gov/how-do-i/check-national-terrorism-advisory-system-ntas

Using these data sets you could perform both historical analysis (including geo-spatial visualizations) and attempt to build a predictive model.
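As a rough illustration of the integration step, the sketch below joins border wait-time records with daily weather readings by date and then averages the wait per hour of day. It is a minimal sketch under stated assumptions: the column names (crossing, timestamp, wait_minutes, date, mean_temp_c), the timestamp format and every sample value are made up for the example and will not match the layout of the real downloads. It uses only the Python standard library.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows, invented for illustration only.
# The real wait-time and weather files will have different columns and values.
WAIT_TIMES_CSV = """crossing,timestamp,wait_minutes
Peace Bridge,2015-01-10 08:00,10
Peace Bridge,2015-01-10 17:00,30
Peace Bridge,2015-01-11 08:00,20
Peace Bridge,2015-01-11 17:00,40
"""

WEATHER_CSV = """date,mean_temp_c
2015-01-10,-5.0
2015-01-11,-12.0
"""


def load_daily_weather(text):
    """Return a dict mapping an ISO date string to the mean temperature."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["date"]: float(row["mean_temp_c"]) for row in reader}


def join_wait_times_with_weather(wait_text, weather_by_date):
    """Attach the same-day temperature to each wait-time record.

    This is an inner join on the calendar date: records with no
    matching weather reading are skipped.
    """
    joined = []
    for row in csv.DictReader(io.StringIO(wait_text)):
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
        day = ts.date().isoformat()
        if day not in weather_by_date:
            continue
        joined.append({
            "crossing": row["crossing"],
            "timestamp": ts,
            "wait_minutes": float(row["wait_minutes"]),
            "mean_temp_c": weather_by_date[day],
        })
    return joined


def mean_wait_by_hour(records):
    """Average the wait time for each hour of the day."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum, count]
    for rec in records:
        bucket = totals[rec["timestamp"].hour]
        bucket[0] += rec["wait_minutes"]
        bucket[1] += 1
    return {hour: total / count for hour, (total, count) in totals.items()}


if __name__ == "__main__":
    weather = load_daily_weather(WEATHER_CSV)
    joined = join_wait_times_with_weather(WAIT_TIMES_CSV, weather)
    print(mean_wait_by_hour(joined))
```

The same date key could be used to attach exchange rates or fuel prices, and the joined records could then be loaded into Excel for Power Map visualizations or used as training input for a predictive model.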

Trial versions and subscriptions

·         Office Professional Plus 2013 or Office 365 (we recommend the Office 365 Pro Plus version)

·         Excel add-ins: Power Map, Power Query

·         Azure ML Trial

Online training

·         Getting Started with Microsoft Azure Machine Learning

·         Faster Insights to Data with Power BI Jump Start

·         Implementing Big Data Analysis

·         Big Data Analytics

Other resources

·         Custom Maps in Power Map (Custom Maps work in Office 365 Pro Plus only)

·         Canadian County and Postal Code Shading in Power Map for Excel

 

Comments (3)

  1. Anonymous says:

    This blog post was created collaboratively by the winning team of the Big Data Hackathon in Data Visualization

  2. Anonymous says:

    Recent Releases and Announcements   ·          SQL 2012

  3. Anonymous says:

    The event is over! Congratulations to the winning teams! > Data Modelling Prize winner: Ontario Parking
