Wiki Life: How To Detect Missing Tags Without Any Effort

Hi, welcome to my first blog post. There’s no point pretending otherwise: I know this title has you puzzled.

Here’s the story: a few weeks ago, I was talking with Ed when he told me that a lot - sorry, A LOT - of articles in the TechNet Wiki don’t have any language tag. (If you’re not sure what a tag or a language tag is, I recommend reading this Wiki page: Wiki: Common Tags.)

So I started thinking about a way to detect these missing tags, a way to help everyone update articles without having to browse the 15,000+ articles currently hosted on the Wiki.

My initial plan

My initial plan was simple: hire an assistant who would detect the missing tag(s) and update the articles. I quickly found this person: he did not complain about working conditions, was very motivated, and typed on the keyboard like no one else. But I had some communication problems with him, and after checking the tags he entered, like “l_çùlp” and “ù;:;k;ioyjk”, I discovered they were not in the Wiki: Common Tags list.

So I had to drop this plan, as this guy named Marlon, Marlon Jester (my 11-month-old son), was not reliable :)

My final plan

In the end, I decided to develop a tool for that. This tool, a PowerShell script, detects missing tags on TechNet Wiki articles: not only the language tag, but all of the following:

  • Language missing (in the title - if not ‘en-US’ - or in the tags list),
  • Has Comment,
  • Has Image,
  • Has Code,
  • Has Video,
  • Has TOC,
  • Has See Also,
  • Has Other Languages / Multi Language Wiki Articles.
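
To make the idea concrete, here is a minimal sketch of how such a check could work. It is not the tool's actual logic (the tool is a PowerShell script, and its detection rules are described in its manual); the `ELEMENT_TO_TAG` mapping, the `missing_tags` function and the sample HTML are all hypothetical, and only three of the tags above are covered.

```python
from html.parser import HTMLParser

# Hypothetical mapping: which HTML element implies which tag.
# The real tool may use entirely different detection rules.
ELEMENT_TO_TAG = {
    "img": "Has Image",
    "pre": "Has Code",
    "video": "Has Video",
}


class FeatureFinder(HTMLParser):
    """Collects the tags implied by the elements found in an article body."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases element names before calling this method.
        if tag in ELEMENT_TO_TAG:
            self.found.add(ELEMENT_TO_TAG[tag])


def missing_tags(article_html, existing_tags):
    """Return the implied tags that are absent from existing_tags."""
    finder = FeatureFinder()
    finder.feed(article_html)
    # Compare case-insensitively, since tag casing may vary between articles.
    existing = {t.strip().lower() for t in existing_tags}
    return sorted(t for t in finder.found if t.lower() not in existing)


# Made-up article body with an image and a code block.
sample_html = '<h1>Title</h1><img src="a.png"><pre>Get-Help</pre>'
result = missing_tags(sample_html, ["en-US", "has image"])
print(result)
```

In this sample, the article already carries the image tag, so only the code tag is reported as missing.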

The list of articles and their missing tags is exported to a CSV file, so you can easily see which articles you have to update and which tags you have to add.

The Brazilian, Chinese and Russian articles (whether hosted on their own wiki or not) are excluded, as they have their own tagging rules.

How to use it

You will find a manual inside the zip file, so I will just explain the major steps here.

Data sources

The tool gathers data from 4 possible sources:

  • A specific URL,
  • The “ALL PAGES” tab,
  • The “NEW PAGES” tab,
  • The “UPDATED PAGES” tab.

 

For the last 3 choices, you have to pass a range of pages as a parameter.

Export file format

The list of articles and their missing tags is exported to a CSV file, which contains 4 columns:

  • Title,
  • URL,
  • TagsToAdd,
  • Comments.
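
If you want to process the export with a script rather than a spreadsheet, the sketch below shows one way to do it. The four column names come from the list above; everything else is an assumption made for illustration: the sample rows and URLs are invented, and the comma used to separate several tags inside "TagsToAdd" is a guess, so check the manual for the real separator.

```python
import csv
import io
from collections import Counter


def load_rows(csv_file):
    """Read the export and keep only the articles that need at least one tag."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if (row.get("TagsToAdd") or "").strip()]


def count_tags(rows, separator=","):
    """Count how many articles need each tag.

    The separator is an assumption; the real export may use another one.
    """
    counts = Counter()
    for row in rows:
        for tag in row["TagsToAdd"].split(separator):
            tag = tag.strip()
            if tag:
                counts[tag] += 1
    return counts


# Invented sample data standing in for a real export file.
sample = io.StringIO(
    "Title,URL,TagsToAdd,Comments\n"
    'Sample article A,https://example.invalid/a,"en-US, Has Image",\n'
    "Sample article B,https://example.invalid/b,Has Code,\n"
    "Sample article C,https://example.invalid/c,,\n"
)

rows = load_rows(sample)
tag_counts = count_tags(rows)

for row in rows:
    print(f'{row["Title"]}: {row["TagsToAdd"]}')
```

With a real export, you would replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")`, where the path and the encoding depend on how you ran the tool.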

The “TagsToAdd” column contains the tags you should add to the article. Here is an example for multiple articles (1 line = 1 article to update):

 

The “Comments” column contains information that is not related to tags, such as:

Things to keep in mind

Over the last 2 weeks, I logged a lot of activities and finished first in the latest contributor awards. You have now discovered my secret: 95% of my updates came from the CSV file(s) produced during the testing phase.

What I want to tell you is not to treat this tool as a way to break records. If you want to have more than 1,000 ‘activities completed’ in a week, you can do that ‘easily’ with the tool, but please remember that none of this is, or should ever be, a competition.

Take your time, launch the tool with a reasonable number of pages, and we will all enjoy the result!

By the way, as Naomi told me yesterday: "when you are fixing the tags in the article using that list, don't forget to actually read the article and fix all other things that may need to be fixed".

Download link

The first version of this tool is available on the TechNet Gallery: Wiki TechNet: Detect Missing Tags.

Enjoy!

Benoît, The French Wiki Ninja