Microsoft Big Data Hackathon Resources

Data sets

ThinkData Works Data sets: https://namara.io/#/

The site consolidates data from open.data.ca, Statistics Canada, provincial data sources, GoodLife Fitness, SpotCrime and others. Here are links to a few datasets of interest:

Category Details
Demographics

· Ward profiles: https://namara.io/#/display/8f403324-dcd4-4f3a-9838-f14171b1337a

· Wellbeing Toronto: https://namara.io/#/display/d7bc1cda-f93c-4886-92d2-431962cb18f6

· Labour force survey (LFS) estimates, by sex and age group: https://namara.io/#/display/461720d7-91a4-4ee7-a325-e2c56025d4bc

Transportation

· Road restrictions: https://namara.io/#/display/261e371e-2f7c-4bd2-8809-9f37a83c7fce

· Traffic and pedestrian volume: https://namara.io/#/display/a0ccf527-4021-4993-a485-d45da787b89d

· TTC stop times: https://namara.io/#/display/9139628b-6d32-497d-8009-b9ae0e89c52f

· TTC Trips: https://namara.io/#/display/25d4d4fe-0a2a-478a-beed-729a94450990

· TTC Stops: https://namara.io/#/display/bbf1ed02-03d0-4e4f-9188-190b05f80d5c

Social and community housing

· Drop in locations: https://namara.io/#/display/4b379976-1e8a-4e5d-9208-c2d567f867e8

· Homeless shelters: https://namara.io/#/display/af527fa3-e936-4d46-b9c1-e6edc80e6f83

Development

· Active building permits: https://namara.io/#/display/31591bd0-9ce5-4f13-87aa-497f30fad217

Cultural / POI

· Cultural spaces: https://namara.io/#/display/a95c244c-9f11-499f-91c0-e8356ff6db0f

· Schools: https://namara.io/#/display/a2b31ef0-1840-4292-ab95-68fc2392a724

· Child care centres: https://namara.io/#/display/1a9d8825-fda8-4870-9a6b-fb2521628267 

· Early years centres: https://namara.io/#/display/a4a02423-68b6-4f07-a69e-96540194f9b1

· Parks: https://namara.io/#/display/735b7e24-0fdb-4918-b112-5958725410a7

· Places of interest: https://namara.io/#/display/389ea13c-ecf6-4f5c-9348-b9bc5d5f24bf

· Businesses: https://namara.io/#/display/d101e178-204b-4fd5-86b3-30bb8083d79f

Other

· Canadian Disaster Database: https://namara.io/#/display/9b024c88-bdd7-4bc4-929d-9f6f29952695

City of Toronto Open Data Catalog: https://www1.toronto.ca/wps/portal/contentonly?vgnextoid=7807e03bb8d1e310VgnVCM10000071d60f89RCRD

Canadian Government Open Data Portal: https://open.canada.ca/en 

Large collection of data sources: https://mran.revolutionanalytics.com/documents/data/?utm_campaign=Data_Elixir_20&utm_medium=email&utm_source=Data%2BElixir

Finance, Economics and Society data: https://www.quandl.com/

You can also use Power Query to retrieve data from Facebook; please read the article about it here

Example:

US/Canada border wait times are available here:

https://open.canada.ca/data/en/dataset/000fe5aa-1d77-42d1-bfe7-458c51dacfef

The data set is not large (around 1M records) and is not very interesting in itself, since analysis is pretty much limited to location and time. Combined with other widely available data sets, however, it could be a basis for relatively interesting exploratory and predictive analysis.

You could integrate and correlate it with:

· Weather data from nearby weather stations: https://climate.weather.gc.ca/

· Canadian dollar exchange rates: https://www.canadianforex.ca/forex-tools/historical-rate-tools/historical-exchange-rates

· Fuel prices: https://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=3260009&paSer=&pattern=&stByVal=1&p1=1&p2=31&tabMode=dataTable&csid and https://www.energy.gov.on.ca/en/fuel-prices/

· Terror alert levels: https://www.dhs.gov/how-do-i/check-national-terrorism-advisory-system-ntas

· …
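The integration suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names (crossing, timestamp, wait_minutes, mean_temp_c, total_precip_mm, cad_per_usd) are hypothetical and will differ from the real downloads, and the three small CSV samples are made-up values standing in for the actual files. The idea is to reduce the wait-time records to a daily average per crossing and then join that with daily weather and exchange-rate tables on the date.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Made-up samples standing in for the real downloads.
# All column names here are assumptions, not the actual schemas.
WAIT_CSV = """crossing,timestamp,wait_minutes
Peace Bridge,2015-01-05 08:00,10
Peace Bridge,2015-01-05 17:00,30
Peace Bridge,2015-01-06 08:00,20
"""

WEATHER_CSV = """date,mean_temp_c,total_precip_mm
2015-01-05,-4.5,0.0
2015-01-06,-9.0,3.2
"""

FX_CSV = """date,cad_per_usd
2015-01-05,1.1760
2015-01-06,1.1830
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def daily_mean_wait(rows):
    """Return {(crossing, 'YYYY-MM-DD'): mean wait in minutes}."""
    sums = defaultdict(lambda: [0.0, 0])  # running [total, count] per key
    for row in rows:
        stamp = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
        acc = sums[(row["crossing"], stamp.date().isoformat())]
        acc[0] += float(row["wait_minutes"])
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def index_by_date(rows):
    """Build a lookup table from the 'date' column to the full row."""
    return {row["date"]: row for row in rows}


def join_daily(wait_rows, weather_rows, fx_rows):
    """Inner-join daily mean waits with weather and exchange rates by date."""
    weather = index_by_date(weather_rows)
    fx = index_by_date(fx_rows)
    joined = []
    for (crossing, day), mean_wait in sorted(daily_mean_wait(wait_rows).items()):
        if day not in weather or day not in fx:
            continue  # skip days missing from either lookup table
        joined.append({
            "crossing": crossing,
            "date": day,
            "mean_wait_minutes": mean_wait,
            "mean_temp_c": float(weather[day]["mean_temp_c"]),
            "total_precip_mm": float(weather[day]["total_precip_mm"]),
            "cad_per_usd": float(fx[day]["cad_per_usd"]),
        })
    return joined


if __name__ == "__main__":
    rows = join_daily(read_rows(WAIT_CSV), read_rows(WEATHER_CSV), read_rows(FX_CSV))
    for record in rows:
        print(record)
```

Joining on the calendar date keeps the sketch simple; with real data you would probably also match each crossing to its nearest weather station and may prefer hourly rather than daily granularity.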

Using these data sets you could perform both historical analysis (including geo-spatial visualizations) and attempt to build a predictive model.
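As a starting point for the predictive part, one possible baseline is a one-variable least-squares line, for example wait time as a function of hour of day. The sketch below is a deliberately simple assumption, not a recommended model: the function names fit_line and predict are hypothetical, and the sample numbers are made up for illustration rather than taken from the border wait data set.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the fitted line at x."""
    intercept, slope = model
    return intercept + slope * x


if __name__ == "__main__":
    # Made-up (hour of day, wait in minutes) pairs, for illustration only.
    hours = [6, 9, 12, 15, 18]
    waits = [5.0, 12.0, 18.0, 26.0, 29.0]
    model = fit_line(hours, waits)
    print("predicted wait at 20:00:", predict(model, 20))
```

A baseline like this is mainly useful as a yardstick: a richer model, such as one built in Azure ML with weather and exchange-rate features, should at least beat it on held-out data.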

Trial versions and subscriptions

· Office Professional Plus 2013 or Office 365 (we recommend the Office 365 Pro Plus version)

· Excel add-ins: Power Map, Power Query

· Azure ML Trial

Online training

· Getting Started with Microsoft Azure Machine Learning

· Faster Insights to Data with Power BI Jump Start

· Implementing Big Data Analysis

· Big Data Analytics

Other resources

· Custom Maps in Power Map (Custom Maps work in Office 365 Pro Plus only)

· Canadian County and Postal Code Shading in Power Map for Excel