Wiki Life: How to Detect Missing Tags Without Any Effort


Hi, welcome to my first blog post. There is no point in pretending otherwise: I know you’re puzzled by this title.

Here’s the story: a few weeks ago, I was talking with Ed when he told me that a lot – sorry, A LOT – of articles in the TechNet Wiki don’t have any language tag. (If you’re not sure what a tag or a language tag is, I recommend reading this Wiki page: Wiki: Common Tags.)

So I started thinking about a way to detect these missing tags, and to help everyone update articles without having to browse the 15,000+ articles currently hosted on the Wiki.

My initial plan

My initial plan was simple: hire an assistant who would detect the missing tag(s) and update the articles. I quickly found this person: he did not complain about working conditions, was very motivated, and typed on the keyboard like no one else. But I had some communication problems with him, and when I verified the tags he had entered, like “l_çùlp” and “ù;:;k;ioyjk”, I discovered they were not in the Wiki: Common Tags list.

So I had to give up on this plan, as this guy named Marlon, Marlon Jester (my 11-month-old son), was not reliable 🙂

My final plan

Finally, I decided to develop a tool for that. This tool, a PowerShell script, detects missing tags on TechNet Wiki articles: not only the language tags, but all of the following tags:

  • Language (in the title, if not ‘en-US’, or in the tags list),
  • Has Comment,
  • Has Image,
  • Has Code,
  • Has Video,
  • Has TOC,
  • Has See Also,
  • Has Other Languages / Multi Language Wiki Articles.
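
To give an idea of how this kind of detection can work, here is a minimal sketch in Python (not the PowerShell script itself). The rule table, the regular expressions and the function name `missing_tags` are all assumptions made for illustration; the real tool may use very different patterns.

```python
import re

# Hypothetical detection rules: tag name -> regex suggesting the tag applies.
# These patterns are illustrative guesses, not the ones used by the real script.
RULES = {
    "Has Image": re.compile(r"<img\b", re.IGNORECASE),
    "Has Code": re.compile(r"<(pre|code)\b", re.IGNORECASE),
    "Has Video": re.compile(r"<(video|iframe)\b", re.IGNORECASE),
    "Has See Also": re.compile(
        r"<h[1-6][^>]*>\s*See Also\s*</h[1-6]>", re.IGNORECASE
    ),
}


def missing_tags(article_html, current_tags):
    """Return the tags whose rule matches the article but which are not set yet."""
    # Compare tag names case-insensitively, ignoring surrounding spaces.
    present = {tag.strip().lower() for tag in current_tags}
    return [
        name
        for name, pattern in RULES.items()
        if pattern.search(article_html) and name.lower() not in present
    ]
```

For an article containing a “See Also” heading, an image and a `<pre>` block, and already tagged with “en-US” and “has image”, this sketch returns `['Has Code', 'Has See Also']`.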

The articles list and the associated missing tags are exported into a csv file, so you can easily see which articles you have to update and which tags you have to add.

The Brazilian, Chinese and Russian articles (hosted on their wiki or not) are excluded, as they have their own tagging rules.

How to use it

You will find a manual inside the zip file, so I will only explain the major steps here.

Data sources

The tool gathers data from 4 possible sources:

  • A specific URL,
  • The “ALL PAGES” tab,
  • The “NEW PAGES” tab,
  • The “UPDATED PAGES” tab.

For the last 3 choices, you have to pass a range of pages as a parameter.
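
As a rough illustration of what a page range means here, the Python sketch below expands a source and an inclusive page range into one listing URL per page. The source names, the `listing_urls` function and the URL template (on a placeholder domain) are hypothetical; the actual parameters are described in the manual.

```python
# Hypothetical source names and URL template; the real script's parameters
# and the Wiki's listing URLs may differ.
LISTING_SOURCES = {"all", "new", "updated"}
URL_TEMPLATE = "https://wiki.example.invalid/{source}?page={page}"


def listing_urls(source, first_page, last_page):
    """Build one listing URL per page in the inclusive range first_page..last_page."""
    if source not in LISTING_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if first_page < 1 or last_page < first_page:
        raise ValueError("page range must satisfy 1 <= first_page <= last_page")
    return [
        URL_TEMPLATE.format(source=source, page=page)
        for page in range(first_page, last_page + 1)
    ]
```

Rejecting an empty or reversed range up front keeps a typo from silently scanning nothing, and a small range keeps each run reasonable.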

Export file format

The articles list and the associated missing tags are exported into a csv file, which contains 4 columns:

  • Title,
  • URL,
  • TagsToAdd,
  • Comments.

The “TagsToAdd” column contains the tags you should add to the article, with one line per article to update.

The “Comments” column contains information that is not related to tags.
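
The four-column layout can be sketched in Python with the standard `csv` module. The column names come from the list above; the `write_report` function, the input dictionary keys and the choice to join several tags with “, ” inside one cell are assumptions, not a description of what the script really writes.

```python
import csv
import io

# Column names taken from the export format described in this post.
COLUMNS = ["Title", "URL", "TagsToAdd", "Comments"]


def write_report(rows, stream):
    """Write one CSV line per article to update."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "Title": row["title"],
            "URL": row["url"],
            # Assumption: several tags share one cell, separated by ", ".
            "TagsToAdd": ", ".join(row["tags_to_add"]),
            "Comments": row.get("comments", ""),
        })


if __name__ == "__main__":
    buffer = io.StringIO()
    write_report(
        [{
            "title": "Sample article",
            "url": "https://wiki.example.invalid/articles/1",
            "tags_to_add": ["Has Image", "Has Code"],
        }],
        buffer,
    )
    print(buffer.getvalue())
```

Because the tags cell contains a comma, the `csv` module quotes it automatically, so the file still opens cleanly in a spreadsheet.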

Things to keep in mind

Over the last 2 weeks I had a lot of activity and finished first in the latest contributor awards. You have now discovered my secret: 95% of my updates came from the csv file(s), during the testing phase.

What I want to tell you is: do not treat this tool as a way to break records. If you want more than 1,000 ‘activities completed’ in a week, you can do that ‘easily’ with the tool, but please remember that none of this is, or must ever be, a competition.

Take your time, launch the tool with a reasonable number of pages, and we will all enjoy the result!

By the way, as Naomi told me yesterday: “when you are fixing the tags in the article using that list, don’t forget to actually read the article and fix all other things that may need to be fixed”.

Download link

The first version of this tool is available on the TechNet Gallery: Wiki TechNet: Detect Missing Tags.

Enjoy!

Benoît, The French Wiki Ninja

Comments (17)

  1. Congrats on your first blog post and thanks for sharing, it's Awesome !!!!!!

  2. Congratz for your first blog mate!

  3. That's a really good start for a first blog post – and a good news for everyone who spends his time to fix the tags.

    BTW: Today the Wiki has 14,274 articles. The "International Community Update" for October ( blogs.technet.com/…/friday-with-international-community-update-progress-in-each-language-oct-2013.aspx ) found about 13.000 articles with language tags. I.e. less than 1.200 article are missing a language tag. However, there are also a lot of other tags missing, and your tool will help to  them…

    It will be hard for your 2nd post to top this first one.

  4. Mehmet PARLAKYİĞİT-MTTC says:

    Congrats on your first blog post and thanks for sharing,

  5. Yagmoth555 says:

    Congrats on your first blog, and really good subject ! (I used to use Google, your method is really better 🙂 )

  6. Hezequias Vasconcelos - MTFC says:

    Congratulations Benoît.

  7. Great blog post. And I think Marlon is a future Guru, especially if he can type fast. He will learn to spell later. I need to check out the tool in the Gallery, because this sounds very useful. I never found a way to check for the absence of a tag.

  8. Benoit Jester says:

    Thank you all for your comments !

    @Carsten : I think I have to prepare my 2nd post carefully 🙂

    @Richard : for Marlon we'll see 🙂 You know, he' s already a Guru in his category … Shout …

    Concerning the tool, tell me if you have ideas to improve it, I have still some work to add/optimize the regex expressions.

  9. Hy Benoit,

    congratulations for the content of your first post on the blog, among other things you've listed a few tags I found it added to my wiki, thanks for the correction and support.

  10. Benoit Jester says:

    Hi Carmelo, can you send me an email with the URL and the incorrect tags, thanks !

  11. I think you need a different assistant! =^)

    Thanks, Benoit, for this amazing tool!

  12. Benoit Jester says:

    @Ed : Ah ah, I think you're right about this little assistant !

    Thanks for the link, glad to be immortalized  ! 😉
