Cloud4Good: Extracting Data from the Twitter API through Azure Functions

Azure Functions has quickly become one of my favorite technologies in Azure. I work on a wide range of projects, many of them proofs of concept or hackathons. In those formats, I am always looking for the fastest way to build the critical pieces of the software.

For the Missing Children Society of Canada, we put together, in a few short days, a system to help identify people in distress via social media. The first implementation was based on someone using a hashtag on Twitter.

As you can imagine, we needed to ensure that our application could handle the load that Twitter could generate. From there we built a pipeline that validates the Tweet, augments it with critical information such as location data, and finally stores the data in DocumentDB. Each step in this process was created as a queue-backed Azure Function.

[Image: azure-functions-chart-mcsc]

Filtering

Filtering ensured that only people who wanted to be found would be tracked. Checking that the person was within Canada, along with other types of data filtering, was also done at this step. The goal was that only genuine messages would pass this step and be worked on further by the system.

Augmenting

Once the message has passed this validation, additional information is added to the payload so that a police officer has enough to work from. The critical data point here is the geolocation where the social media message was posted. From there we can route the request for assistance to the correct police department.

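// Assumption: `twit` is a client from the `twitter` npm package.
// The environment variable names below are placeholders.
var Twitter = require('twitter');
var twit = new Twitter({
    consumer_key: process.env.TWITTER_CONSUMER_KEY,
    consumer_secret: process.env.TWITTER_CONSUMER_SECRET,
    access_token_key: process.env.TWITTER_ACCESS_TOKEN_KEY,
    access_token_secret: process.env.TWITTER_ACCESS_TOKEN_SECRET
});

module.exports = function (context, message) {
    var rest = 'statuses/show/' + message.tweetid;
    twit.get(rest, { include_entities: true }, function (error, tweet, response) {
        // Fail the invocation if the lookup did not succeed.
        if (error) { return context.done(error); }
        // Twitter returns GeoJSON coordinates as [longitude, latitude].
        if (tweet.coordinates != null && tweet.coordinates.coordinates.length >= 2) {
            message.longitude = tweet.coordinates.coordinates[0];
            message.latitude = tweet.coordinates.coordinates[1];
        }
        // Write the output only after the asynchronous lookup has completed.
        context.bindings.out = message;
        context.done();
    });
};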
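
The tweet's `coordinates` field is a GeoJSON point, which lists longitude before latitude, so the two are easy to swap. One possible way to keep that detail in one place is a small helper like the sketch below. The helper name `extractLocation` and the sample tweet are made up for illustration.

```javascript
// Returns { longitude, latitude } for a tweet object,
// or null when the tweet carries no point location.
function extractLocation(tweet) {
    var point = tweet && tweet.coordinates;
    if (!point || !Array.isArray(point.coordinates) || point.coordinates.length < 2) {
        return null;
    }
    // GeoJSON order: longitude first, then latitude.
    return { longitude: point.coordinates[0], latitude: point.coordinates[1] };
}

// Hypothetical sample shaped like the `coordinates` field of a tweet.
var sample = { coordinates: { type: 'Point', coordinates: [-114.0719, 51.0447] } };
console.log(extractLocation(sample));
```

Because the helper is a plain function with no Twitter client or Functions context involved, it can be checked in isolation before it is wired into the augmenting step.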

Storage

Storing the data was straightforward at this point, as the bindings in Azure Functions really do most of the work.

{
  "bindings": [
    {
      "queueName": "tostore",
      "name": "in",
      "type": "serviceBusTrigger",
      "direction": "in",
      "connection": "AzureWebJobsServiceBus",
      "accessRights": "Manage"
    },
    {
      "type": "documentDB",
      "name": "out",
      "databaseName": "missingdata",
      "collectionName": "twitter-data",
      "createIfNotExists": true,
      "connection": "DocumentDB",
      "direction": "out"
    }
  ],
  "disabled": false
}

You can see here that the Function is triggered by a Service Bus queue and stores the data in a DocumentDB collection.
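
With bindings like these, the code for the storage step can stay very small. The following is a sketch of what it might look like rather than the project's actual Function. The `storedAt` field is a hypothetical addition for illustration.

```javascript
// Storage step: the Service Bus trigger delivers the queue message as `message`,
// and whatever is assigned to context.bindings.out is saved by the DocumentDB binding.
function store(context, message) {
    // `storedAt` is a hypothetical extra field, not part of the original payload.
    var doc = Object.assign({}, message, { storedAt: new Date().toISOString() });
    context.bindings.out = doc;
    context.done();
}

module.exports = store;
```

The Function never talks to Service Bus or DocumentDB directly. The names `in` and `out` from the bindings file are the only link between the code and the services behind it.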

Git

While you are working on your Functions, you can work in the Azure portal, or check your code into Git and have the Functions updated on commit. I tend to do a lot of my debugging and experimenting in the portal and then transfer the code into my Git repository. This lets me tweak the code while watching the data flow through the system. Of course, I only do this when I am not working on production.

Wrap Up

Functions really have two major pieces that you can work with: the bindings, which set up where the data is coming from and where it is going, and the actual code you want to run. This composition gives you the freedom to focus your code on the actual work you need to do. I also find that during the development phase I can quickly change my mind about data storage, queue names, and other things that end up being trivial to change on the fly.

[Image: azure-functions-mcsc-app]

Code Repository

A new organization was created, and multiple repositories were added so that multiple teams could work on different areas of the project. The org can be found at https://github.com/CDN-Missing-Children-Hack

[Image: azure-functions-mcsc-github-repo]

Functions Takeaway

You don't even need an Azure account to get started with Azure Functions. Simply go to functions.azure.com and click "Try it for free" to start creating functions.