Wiki life: TechNet Wiki tagging, the ugly truth.


How much time do you spend on setting good tags when you're putting together a new Wiki article, or when you're reviewing existing TNWiki articles?
Seconds, minutes?

Have you ever had some bizarre suggestions when you tried to add tags to your article?
In this blog post I'll provide some insights into the world of TechNet Wiki tagging.
And I'll provide some practical hints and tips to get it done correctly…

How do tags work?

Check the description on Wikipedia: http://en.wikipedia.org/wiki/Tag_(metadata).

"In information systems, a tag is a non-hierarchical keyword or term assigned to a piece of information (such as an Internet bookmark, digital image, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system."

Usually a tag is used to assign categories to an article, to group it with related articles, and to make articles easier to search and manage…
But how does that apply to TechNet Wiki?

Facts and figures

Let me kick off with some numbers.
(Note: the numbers vary from day to day, so this is a snapshot of today's statistics, but the overall picture stays the same over time.)

How many articles do we have at TechNet Wiki?
16,000+ (Hint: see the TNWiki featured articles page.)

How many tags do we have at TNWiki?
18741 (and growing)

Yes, 18K+ tags!

For your information, the TechNet Wiki sitemap has the list of ALL tags.

WARNING: It's an awfully large document that takes quite some time to load. (I've warned you!)

So there is something wrong here: we have more tags than articles…
Using some PowerShell scripting and the TechNet Wiki sitemap, I've analyzed the numbers.

Deep dive on TechNet Wiki tags

By default, TechNet Wiki shows you the most popular tags in the TNWiki Tag Cloud (https://aka.ms/TNWikiTagCloud).

The tag cloud contains the top 100 most used tags.

What's behind the TNWiki tag cloud?

Top 3 popular tags:

  1. EN-US: 9840+ articles
  2. Has Image: 4500+ articles
  3. Has TOC: 3400+ articles

This top 100 does not count the hidden/deleted articles.

Just for your information, we have almost 5000 hidden/deleted articles: mostly spam, duplicates, violations of the TOU (terms of use), and archived articles.

But there is more interesting stuff…

Top 10 Tag frequency graph

If you check how frequently each tag is used, you'll get some astonishing numbers.

Times used   Number of tags
1            12295
2            2292
3            973
4            525
5            329
6            190
7            194
8            125
9            116
10           100

In short, 12295 tags have been used only once.
Another 2292 tags have been used just twice.
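
A table like the one above can be produced by counting how many tags share each usage count. The following is a minimal Python sketch, not the script used for this post (that one was written in PowerShell). The `tag_counts` dictionary and its values are made up for illustration; in practice they would come from whatever export of the tag list you have.

```python
from collections import Counter


def usage_histogram(tag_counts):
    """Map 'times used' to 'number of tags used that many times'."""
    return Counter(tag_counts.values())


# Hypothetical sample data: tag -> number of articles carrying that tag.
tag_counts = {
    "en-US": 120,
    "has image": 45,
    "PowerShell": 2,
    "VB": 2,
    "powershell script to list users": 1,
    "my very specific one-off tag": 1,
}

histogram = usage_histogram(tag_counts)

# Print the ten lowest usage counts, like the table in this post.
for times_used in sorted(histogram)[:10]:
    print(times_used, histogram[times_used])

# Share of tags that were used exactly once.
single_use_share = histogram[1] / len(tag_counts)
print(f"{single_use_share:.0%} of the sample tags are used only once")
```

Applied to the figures in this post, 12295 single-use tags out of 18741 tags is roughly 66% of all tags, assuming both numbers come from the same snapshot.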

Tag length

If you check the statistics on tag length, it gets bizarre.

The longest tag is 173 characters.
The shortest tag… is 1 character.

More than half of the articles have tags longer than 14 characters.
Here you see how the tag length is spread:

 

Number of words per tag

Tag count   Words in tag
5892        1
5573        2
3317        3
1383        4
755         5
419         6
258         7
168         8
112         9
68          10
64          11
50          12
25          13
23          14
10          15
15          16
12          17
10          18
2           19
1           20
5           21
4           22
3           23
1           24
1           25
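
The length and word-count figures above boil down to two simple measurements per tag. Here is a small Python sketch of that idea; the sample tags are hypothetical, and splitting on whitespace is only one assumption about what counts as a "word" in a tag. The function expects a non-empty list.

```python
from collections import Counter
from statistics import median


def tag_shape(tags):
    """Summarize character length and word count for a non-empty list of tags."""
    lengths = [len(tag) for tag in tags]
    # Count how many tags have 1 word, 2 words, and so on.
    words = Counter(len(tag.split()) for tag in tags)
    return {
        "shortest": min(lengths),
        "longest": max(lengths),
        "median_length": median(lengths),
        "words_per_tag": dict(sorted(words.items())),
    }


# Hypothetical sample tags, from short and reusable to long and one-off.
sample_tags = [
    "VB",
    "Visual Basic",
    "Troubleshooting",
    "How to install SQL Server 2012 step by step",
]

stats = tag_shape(sample_tags)
for name, value in stats.items():
    print(name, value)
```

A long tail in `words_per_tag` is a quick sign that tags are being written as sentences rather than keywords.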

If you want to analyse it yourself: I've shared the analysis data here (but I can't guarantee lifetime availability).

TNWiki tags collection.xlsx

See also

If you need more help on using tags, check these resources:

Lessons learned

Back to the basics:

  • Keep it simple.
  • The more a tag is used, the better it gets.

Takeaways: How to set tags properly

  • The power of the tag is in being NON-UNIQUE
    • Re-use tags as much as possible
    • A tag used one time is useless; it will not be found.
    • A unique tag has no efficiency.
  • Keep the tags short
    • Think of an article tag as a #hashtag
    • The fewer characters used, the better
  • Keep the tag word count down
    • Preferably 1-word or 2-word tags
    • Only use more words if REALLY necessary
  • Check if there are similar tags in use already
  • Better multiple good, reusable keywords than one long one-time tag
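
Checking for similar tags before creating a new one can be approximated with fuzzy string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library. It is not how the Wiki editor produces its suggestions; the function name `suggest_existing_tags`, the sample tag list, and the 0.6 cutoff are all assumptions made for illustration.

```python
import difflib


def suggest_existing_tags(candidate, existing_tags, limit=3, cutoff=0.6):
    """Return existing tags that look similar to the candidate, best match first."""
    # Compare case-insensitively, but return tags in their original spelling.
    by_lower = {tag.lower(): tag for tag in existing_tags}
    matches = difflib.get_close_matches(
        candidate.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[match] for match in matches]


# Hypothetical list of tags that are already in use.
existing = ["PowerShell", "Troubleshooting", "Visual Basic", "SQL Server"]

print(suggest_existing_tags("powershell scripts", existing))
print(suggest_existing_tags("sql server", existing))
```

If the function returns a match, re-using that tag is usually the better choice than adding a new variation.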

Hints:

  • If you want to avoid the wrong tags being suggested in the Wiki editor: type your tag, and terminate it with a comma.

I sincerely hope this helps you to be a better TechNet Wiki article writer, and to get more out of TNWiki.
Because using tags properly is a responsibility we all share!

Comments (16)

  1. Nicely explained! Thanks Peter!

  2. Durval Ramos says:

    Peter, great post!

    I think we have more tags than articles for two reasons:

    – Defining Technologies for abbreviations and also their names (e.g.: "VB" and "Visual Basic")
    – Tag translation into several languages (e.g. In English: "Troubleshooting" and in Portuguese: "Solucionando Problemas" )

  3. I’ve been updating the article with graphs and stats… There you’ll see that the root cause of the large numbers is not double usage (like "VB" vs "Visual Basic"). If you would use "VB" and "Visual Basic" just more than one time, the total number will be dramatically less…

  4. Fantastic statistics! Thanks, Peter!

  5. Shanky_621 says:

    Great post Peter, I would like council members to enlighten us with such posts often…. I would be careful with Tags and would educate others as well

  6. Durval Ramos says:

    Peter, amazing work !!! Thanks for sharing.

  7. Naomi N says:

    Very interesting

  8. Awesome Peter! Those are very interesting statistics! Especially the statistics on the number of tags used once, and the number of tags with more than 2 or 3 words!

  9. Anonymous says:

    Let's end this week with two pieces of news for our Community, which came about from

  10. pituach says:

    Nice statistics, Peter!
    * The tags allow users to find articles on a particular topic or specific issue. Article can be assigned to multiple tags of course. It is natural that new systems will have more tags than the number of publications. In fact if we go to extremism for the sake of the discussion, in a system with single article it is clear that the number of tags will be higher than the number of articles. In time when the number of articles growth then the ratio should be changed.
    ** Indexes articles (like "a list of all X articles") should reduce the need of the tags and therefore reduce the number of tags. Those articles should be linked from the front page.

  11. Anonymous says:

    Hello most valuable Wiki family,
    I am calling Wiki family because we are really very big family. In

  12. Thanks Peter. Very helpful.

  13. Anonymous says:

    When you’re publishing articles to TechNet Wiki, you’ll see some cases where the content you