BingMatrix – A Windows Azure application that provides a fun way to mine data from Bing

I wanted to share a little application I put together using Windows Azure. It uses Bing queries to find out the popularity of a specific set of keywords on a specific set of sites. I actually created this for my own use while researching how frequently some registry keys are mentioned on Microsoft support, on TechNet blogs or on the TechNet forums.

To get started, go to https://bingmatrix.cloudapp.net and provide:

  1. Title (optional)
  2. List of keywords
  3. List of web sites
  4. Additional keywords (optional)

Here’s a sample screenshot of the input screen:

image

You can use one of the sample queries provided. For instance, to get the data above I simply clicked the “SMB2” button on the right. To get your results, click on the big “Build my BingMatrix” button on the left. Please note that it will take a few seconds to build the matrix, so be patient. Here are the results for the sample above:

image

For the specific example above, it searches for each of the 4 keywords on each of the 5 sites. To do that, it goes to Bing 20 times (4 keywords × 5 sites) to get the results. For each query, it uses one keyword from the “keywords list”, one site from the “sites list” and adds whatever was typed in the “additional keywords” field. For instance, for the query on “Performance” on the “blogs” site, it passes the following query string to Bing: +"Performance" +"File Server" +SMB2 site:blogs.technet.com. The table with the results includes links to each individual query, so you can go directly to Bing to find the details.
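To make the query-building step concrete, here is a minimal sketch in Python. It is not the actual BingMatrix code (which is not shown in this post); the function names are made up for illustration, the additional keywords are assumed to be already formatted as +terms, and every keyword and site other than “Performance” and blogs.technet.com is a placeholder. The Bing link format (https://www.bing.com/search?q=...) is also an assumption about how the per-cell links could be built.

```python
from itertools import product
from urllib.parse import urlencode


def build_query(keyword, site, additional_terms):
    """Combine one keyword, the additional terms and one site into a query string."""
    parts = [f'+"{keyword}"']          # the keyword is quoted and required
    parts.extend(additional_terms)     # assumed to be pre-formatted, e.g. '+SMB2'
    parts.append(f"site:{site}")       # restrict the search to a single site
    return " ".join(parts)


def bing_url(query):
    """Build a link to the Bing results page for a query (assumed URL format)."""
    return "https://www.bing.com/search?" + urlencode({"q": query})


def build_matrix_queries(keywords, sites, additional_terms):
    """Return one query per (keyword, site) cell of the matrix."""
    return {
        (keyword, site): build_query(keyword, site, additional_terms)
        for keyword, site in product(keywords, sites)
    }


# Placeholder inputs: only "Performance" and blogs.technet.com come from the post.
keywords = ["Performance", "keyword2", "keyword3", "keyword4"]
sites = [
    "blogs.technet.com",
    "site2.example.com",
    "site3.example.com",
    "site4.example.com",
    "site5.example.com",
]
additional_terms = ['+"File Server"', "+SMB2"]

queries = build_matrix_queries(keywords, sites, additional_terms)
sample = queries[("Performance", "blogs.technet.com")]

print(len(queries))      # number of Bing queries needed for the matrix
print(sample)            # query string for a single cell
print(bing_url(sample))  # link to that query on Bing
```

With 4 keywords and 5 sites the dictionary holds 20 entries, which is why a larger matrix takes longer to build: the number of Bing queries grows with the product of the two lists.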

Here are a few additional sample results:

image

image

You should interpret these results carefully, since they can vary widely depending on the additional keywords provided. Also, Bing is constantly crawling the Internet, so the output for the same query will change over time. For instance, the numbers you see in the screenshots above will probably be different by the time you try them out. It's also important to note that if you get millions of hits for a certain query, the numbers are obviously less precise. If you get just a few dozen, they are usually fairly accurate.

You can also provide direct links to a BingMatrix query. For instance, here are the direct URLs to the complete list of 12 sample queries provided on the main page of the site:

Try it out at https://bingmatrix.cloudapp.net and make sure to post a comment if you like it. You can obviously type in any keywords, sites or additional keywords to build your own matrix. Just keep in mind that if your matrix is too big, it will take longer to process. It might also time out. This was a weekend project for me and I have made a few updates in the last few weekends. Feel free to provide feedback and suggest improvements.


Updated on 12/15/2010: Deployed a new version that works faster by using multiple threads to query Bing.
Updated on 12/21/2010: New sample queries added (8 total now).
Updated on 12/23/2010: New sample queries added (12 total now). Support for passing in parameters in the URL. Added title field. Added “Building…” message.
Updated on 01/03/2011: More sample queries added (18 total now). Samples now come from a SQL Azure database.