Does your OpsDB keep growing? Is your localizedtext table using all the space?


This post is about an issue in OpsMgr SP1 AND R2 – where the localizedtext table in the database may fill and consume large amounts of space.

OpsMgr 2007 no longer has a hard database limit of 30GB like MOM 2005 did.  For this reason, most OpsMgr administrators don’t watch this very closely anymore, or freak out when it gets big.

However – it must be noted… console and operational performance are still impacted when this DB gets big.  You really should keep an eye on it and try to keep it as small as possible.  In general, I recommend keeping only 2 days of operational data (the Database Grooming global setting), reduced from the default of 7 days, until everything is tuned.
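If you want to see what your grooming is currently set to before adjusting it, you can peek at the grooming table in the OpsDB. This is a read-only sketch; dbo.PartitionAndGroomingSettings is the table OpsMgr 2007 uses for these settings, and you should always change the actual values through the console (Administration, Settings, Database Grooming), never via SQL:

```sql
--Read-only: display the current grooming settings stored in the OpsDB
--(change the actual values via the console, not via SQL)
SELECT * FROM dbo.PartitionAndGroomingSettings WITH (NOLOCK)
```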

One thing I have noticed at several locations is that a couple of tables often grow quite large… depending on the agent count and which management packs are installed.  These are LocalizedText and PublisherMessages.  This is caused by management packs that create a large number of events from script.  I have seen this mostly in environments that have funky converted MOM 2005 MPs that run a lot of backwards-compatibility scripts, or in large Exchange 2007 and SCCM deployments.  Like I said – this won’t affect all customers… just those with specific management packs that expose this.  What happens is that each event writes additional data to these tables, and they are never groomed or pruned…. so they keep growing.  Over time, the impact is that your DB might keep filling and run out of disk space, or your performance might suffer when you use a view that queries LocalizedText.

 

* Am I impacted by this issue? *

 

To know if you are impacted – I would run the following query against your OpsDB:

Simple query to display large tables, to determine what is taking up space in the database:

SELECT so.name,
       8 * SUM(CASE WHEN si.indid IN (0, 1) THEN si.reserved END) AS data_kb,
       COALESCE(8 * SUM(CASE WHEN si.indid NOT IN (0, 1, 255) THEN si.reserved END), 0) AS index_kb,
       COALESCE(8 * SUM(CASE WHEN si.indid IN (255) THEN si.reserved END), 0) AS blob_kb
FROM dbo.sysobjects AS so
JOIN dbo.sysindexes AS si ON (si.id = so.id)
WHERE 'U' = so.type
GROUP BY so.name
ORDER BY data_kb DESC

Normally, in most typical environments with typical MPs, we’d expect the perf data tables to be the largest, followed by event, state, and alert.  If LocalizedText is your largest table, this issue is affecting you.  You can then run the following query:
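As a side note – on SQL Server 2005 and later you can get the same table-size picture from the newer catalog views instead of sysobjects/sysindexes. This is just an alternative sketch, assuming sys.dm_db_partition_stats is available on your instance:

```sql
--Approximate per-table reserved space and row counts using the SQL 2005+ catalog views
--(rows counted only from the heap or clustered index, to avoid double-counting)
SELECT o.name AS table_name,
       8 * SUM(ps.reserved_page_count) AS reserved_kb,
       SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count
FROM sys.dm_db_partition_stats AS ps
JOIN sys.objects AS o ON o.object_id = ps.object_id
WHERE o.type = 'U'
GROUP BY o.name
ORDER BY reserved_kb DESC
```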

select count(*) from localizedtext

Generally, if this table is the largest in your database and has over a million rows, you are impacted.  The impact is low, however… mostly just hogging space in the DB, and possibly degrading console performance.

 

* OK – I am impacted.  What do I do? *

 

You need to run the attached SQL statements to clean this up.  You might need to run these on a regular basis (once a week to once a month) if the table grows back quickly.  To run these – open SQL Server Management Studio, connect to the SQL instance that hosts the OperationsManager DB, and open a New Query window.  Then paste the text from one of the attached scripts into the query window, and execute it.

When you upgrade to R2 – most of this is resolved…. we no longer fill this table.  However, you WILL need to run the cleanup at least once to get rid of all the old junk left over from SP1 days.

I am attaching TWO scripts below, which clean up these tables.  That being said – these scripts are NOT supported by Microsoft, as they have not been thoroughly tested.  They are being provided "AS IS" with no warranties, and confer no rights. Use of the included script samples is subject to the terms specified in the Terms of Use.

***UPDATED for R2:

There are now TWO scripts.

If you are on SP1 – you run both on a regular basis.

If you are on R2 – you only need to run the LocalizedTextCleanupforSP1.txt script ONCE, and then run the LocalizedTextCleanupSP1andR2.txt script on a regular basis.

This core issue was fixed in R2, however – since R2 released we found another type of data that gets left in the LocalizedText table, so this second script was developed.

 

*** Critical Note:

These scripts will require a LARGE amount of TempDB (mostly TempDB log) space – make sure your TempDB is on a volume with lots of room to grow… if not – add an additional TempDB file on another volume just in case.  Make sure you take a good SQL backup of your OpsDB FIRST.  In general, the script takes about 20 minutes per million rows in the LocalizedText table, depending on the hardware capabilities of the SQL server.  I have seen it take 10 minutes per million rows on a fast server.
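Taking that backup first is simple enough; a minimal sketch is below. The disk path is an example only – point it at a volume with enough free space for a full backup of your OpsDB:

```sql
--Full backup of the OpsDB before any cleanup (example path; adjust for your environment)
BACKUP DATABASE OperationsManager
TO DISK = N'D:\Backup\OperationsManager_PreCleanup.bak'
WITH INIT, STATS = 10
```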

Now – when I say LOTS of space for your TempDB – I mean it.  LOTS.  I believe it is the TempDB log that needs most of the space.  Just make sure you have at least as much TempDB space as the size of your LocalizedText table.  That means if your LT table is 40 million rows (~40GB), then I would plan on at LEAST 40GB of free space for your TempDB/TempDB log to grow into.  Changing the default autogrow on these to a larger value, and growing them out in advance, will help speed up the process as well.
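If you do need to add a TempDB file on another volume, or grow the existing files out in advance, something like the following works. The logical file names, paths, and sizes below are examples only; check sys.master_files for your actual TempDB file names, and size things to match your LocalizedText table:

```sql
--Add an extra TempDB data file on a volume with plenty of free space (example name/path/size)
ALTER DATABASE tempdb
ADD FILE (NAME = N'tempdev2', FILENAME = N'E:\TempDB\tempdev2.ndf', SIZE = 20GB, FILEGROWTH = 1GB)

--Pre-grow the TempDB log so the cleanup is not slowed by repeated autogrows (example size)
ALTER DATABASE tempdb
MODIFY FILE (NAME = N'templog', SIZE = 40GB, FILEGROWTH = 1GB)
```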

 

When the script is done, you won't see the freed space immediately.  You need to run a – DBCC DBREINDEX ('localizedtext') – to reindex the table, and show the newly freed space.  It would likely be a good idea to reindex the entire database at this point, which you can do by running the following:

Reindex the database:

USE OperationsManager
go
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
EXEC SP_MSForEachTable "Print 'Reindexing ' + '?' DBCC DBREINDEX ('?')"
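If you want to confirm the reindex actually did its job, the fragmentation view gives a quick before/after picture. This is a sketch; sys.dm_db_index_physical_stats requires SQL Server 2005 or later:

```sql
--Check fragmentation on the LocalizedText indexes (LIMITED mode is the fastest scan)
SELECT i.name AS index_name,
       s.avg_fragmentation_in_percent,
       s.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.LocalizedText'), NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i ON i.object_id = s.object_id AND i.index_id = s.index_id
```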

If you first want to troubleshoot, and try to determine what is consuming your tables… or which MPs are generating the most noise in this table…. you can run the following queries (they might take a LONG time to complete – depending on how big your tables are):

Most common events:

select messageid, ltvalue, count(*) as Count from publishermessages with(nolock)
inner join localizedtext with(nolock)
on messagestringId = localizedtext.ltstringid
group by messageid, ltvalue
order by Count DESC

LT insertions per day/month:

SELECT
DATEPART(mm,timeadded) AS 'MONTH',
DATEPART(dd,timeadded) AS 'DAY',
count(*)
from localizedtext with(nolock)
group by
DATEPART(mm,timeadded),
DATEPART(dd,timeadded)
order by
DATEPART(mm,timeadded),
DATEPART(dd,timeadded)

LocalizedTextCleanup.zip

Comments (61)

  1. Kevin Holman says:

    @Hemant –

Reindexing is not a requirement – it is just needed to recognize the free space, and it is good for performance, since we deleted a lot of data.

  2. Anonymous says:

I had trouble with the OpsMgr console performance at several clients. One of the most important things

  3. Kevin Holman says:

    1.  This issue is resolved in R2, for the most part.

2.  By moving to the native Exchange MP for R2, you also remove this issue.  Microsoft HAS shipped the native Exchange MP.  However – because it needs some native PowerShell modules that were added in R2, it is an R2-only MP.

    3.  R2 is not a "rebuild" it is an upgrade, and installs just as most hotfixes do.

  4. Kevin Holman says:

    There is no LocalizedText table in the DW.

  5. Kevin Holman says:

    There is no requirement to stop any services in order to run the cleanup script.

That said – if you have a large management group, with poor disk I/O on the back end, and your DB is always hammered…. the cleanup script will run faster if you disable the Health Service on the RMS and all management servers.  However – that is not a requirement, and it essentially stops all monitoring, so I don't recommend that unless the script will not complete due to I/O issues.  But at that point – you have bigger issues with I/O than just LocalizedText table size.  🙂

  6. Anonymous says:

    I upgraded my test lab from SP1 to R2-RTM this weekend.   My current test lab consists of the following

  7. Kevin Holman says:

    You PASTE the contents of these TXT files into a new SQL query window – targeting the OpsDB SQL instance.

  8. Kevin Holman says:

    @Alex –

    NO.  These scripts are only for OpsMgr 2007 DB's where this is an issue.  I do not expect to see Localizedtext filling the DB on SCOM 2012.  You should only run these on SCOM 2007 DB's, and only if impacted by the issue.

  9. Kevin Holman says:

That is MOM 2005 – not SCOM.  The script above ONLY applies to SCOM, and I hope you didn't run it against your MOM 2005 Onepoint DB?

    I don't know what would cause that to be the top table consumer, or 29GB in size – but something is very wrong there.  I would open a case with PSS support immediately.

  10. Kevin Holman says:

    11 days?  Ouch!

I would bet, if it is running that long, that you are being impacted by lots of TempDB growths… this will slow down the job massively – you should pre-allocate space for this – and it will need a LOT!!!

    It also sounds like your database server might be resource constrained on disk I/O, CPU, and/or memory… the rule of thumb is 20 minutes per million rows – so if this cannot complete in ~20 hours then your SQL server is in bad shape.

  11. Kevin Holman says:

    Dont pay attention to how much it has deleted – the bulk of the job is creating a temp table in tempDB which will contain all the events we need to delete.  

    If you are over 1 hour per million rows as you say – I'd argue that it is likely your database is vastly undersized from an I/O perspective….  

There are only two options to speed this process up other than just plowing through it…

    1.  Stop the management server and RMS health, config, and SDK services, essentially shutting down the management group.  This speeds up the cleanup a lot in larger management groups allowing all DB activity to be reserved for the cleanup job.

2.  Stop the MS and RMS servers, and then detach the database – and attach it elsewhere on a much faster and more powerful SQL server.  You need a SQL server with REALLY good disk I/O, lots of CPU, enough memory to do the job (8GB should be fine, but more is better), FAST disks where TempDB resides, and distinct disks for the DB and t-log.

  12. Kevin Holman says:

    @Sander –

Ugh.  We did change some things around alerts in CU3.  I sure hope we didn't break this cleanup script.  I will look into this as soon as I can and get back to you.

  13. Kevin Holman says:

    So, typically, I expect 20 minutes per million rows.  

    I would have estimated your cleanup to be around 7 hours, 14 MAX.

This makes me wonder if you really have good disk I/O for the SCOM subsystem.  You are using a DL380 – with only 130 agents – how are the disks set up on this server?  Are you using RAID5?  Do you have the RMS and DB on the same server?

  14. Mark Verbaas says:

Kevin, great post. We really had a big table. Unfortunately, the first time we ran into disk space issues. That's how I figured out that the error-processing part does not delete the first temporary table (#PublisherMessageReverseIndex).

  15. Anonymous says:

Did this on the Management Server of Essentials Business Server because of problems with SCE 2007.

No problems! It took me 76 minutes to clean up 3.5 million rows on a Hyper-V VM.

    thx!!

  16. Anonymous says:

    If you’ve been reading this blog for any length of time then you’re already familiar with the postings

  17. Kevin Holman says:

    RE:  SCE –

I believe they do work for SCE – but I am not 100% sure.  You would need to check with an SCE expert.

  18. Kevin Holman says:

    @Sander –

I finished testing.  There is no issue and no repro.  I can run the scripts to clean up LT just fine on my R2 CU3 database.  If yours are not working – I would have to assume you have corrupted something in your database outside the set DB schema.  Modifying ANY index on ANY table in OpsMgr is unsupported – and makes your database suspect.  Did you do any edits to the database, ever, prior to running the cleanup script?

It is possible you are facing something previously unknown, but I can say with confidence it isn't CU3 at the root of the issue.  I would recommend a call to PSS support if you have a large LocalizedText table, need to clean it up, and the published scripts in the zip file on my blog are failing.

  19. Anonymous says:

Do I ever need to be concerned about the LocalizedText table in the OpsMgr DW? Or is this only for the OpsMgr DB?

    Thanks,

    Tom

  20. Kevin Holman says:

    Hi Lisa –

    Are you on SP1 or R2?

    How many rows remain in the LT table?

    Did you run both scripts or just one?

  21. Anonymous says:

    Lifesaver!

    Thankyou,

    John Bradshaw

  22. Kevin Holman says:

    Congrats!!!  365k rows is nothing!   Sounds like a much more performant database!

Re: still seeing old stuff in the LT table – IGNORE it.  You have it "as good as you can get and be supported" the way it is now.  You CANNOT delete anything manually – as you can corrupt the database, and any manual deletes are completely unsupported.  Only those specific scripts, run exactly as they are – are supported.  There are some old items that should be there by design, and we have also found some LT data that neither of our scripts catches… so it may be that this is "as good as it gets".  Believe me – it won't have a huge console performance impact unless it gets really, really big again.

    See my recent post on the OpsDB from a database perspective – I call out several other SQL related items that you can look at to see if something is awry in your environment.  Then be a hero and send the data on to your SCOM admins.  🙂

  23. Kevin Holman says:

    @mikasan –

You have one bit confused.  Running the reindex does NOT shrink the actual file size of the database.  Once it has grown, it should NEVER be shrunk to reclaim space – or you will fragment it and cause severe performance issues.

Running the reindex forces the database to show the reclaimed USED space in the database (or to show the real amount of FREE space after cleanup – however you want to look at it).

    There is nothing wrong with having a 47GB database file with 8GB of USED space.  That's a good thing – plenty of free space in the DB file.

  24. Anonymous says:

    I can confirm that this works for SCE as well.

    We’re running EBS 2008 (containing Exch2007, SCE & SQL Express 2005). We had the same issue and it is now resolved.

    I’ve posted my own findings of this issue on technet.

  25. Anonymous says:

    Thanks for the tip and I'd love to read the article but can't find it.  Would you mind replying with a link to that article?

    Thanks.

  26. Kevin Holman says:

    I have no idea what "WorkFlowTimestamps" table is, or how it relates to this issue.  Can you give more information on how you deduced that table name?  Was that in an error output or something?

  27. One of our databases had almost 20 million rows in localized text. Thanks for the post

  28. xacolabril says:

Oh! Our OperationsManager database had 18 million rows in LocalizedText and our performance was very poor.

    Thank you for the post too.

  29. zied says:

We have 6 million rows in LocalizedText; the performance of alerts and events is poor…

  30. zied says:

Thank you very, very much for the information, and for this wonderful blog.

  31. Daniel Elliott says:

    Thank you Kevin,

    Our environment is large with 30 Exchange 2007 servers and 180,000 mailboxes.

Finding this post has saved me many hours of frustration. Our DB was 70GB with 50 million entries in the LocalizedText table. After running your scripts, our DB is now 4GB and operating with far less noise.

It would be good if this were resolved, as I don’t really want to rebuild our SCOM environment with R2. It would also be nice if Microsoft released the native Exchange 2007 MP; two years after Exchange 07 and we are still monitoring with a converted 03 Exchange MP (poor form, msoft).

    Once again, thanks Kevin!!

  32. Pascal Slijkerman says:

    Thanks for the script!

We monitor about 130 agents with more than 25 MPs. We had over 20,000,000 rows of LocalizedText.

    The script took 27.5 hours!!! On an HP ProLiant DL380 G5 with a P500 RAID controller and 512MB BBC.

    Our database was 26GB and has dropped to a 2.4GB

It’s now nice and fast 🙂

  33. Gerardo says:

Hi, maybe you can help me. The result of the script was that the WorkflowTimestamps table has 9 million rows, so I don't know how to reduce the space. FYI, the grooming process finishes OK. So what can I do?

  34. Gerardo says:

Sorry, I’ll try to explain better. I ran the "Simple query to display large tables" and the result was (I’m showing only the top 10):

    WorkflowTimestamps 29979232 451864 0

    Event_13 95344 15344 0

    ProcessRuleComment 44120 392 0

    Event_15 40864 7048 0

    Event_14 28048 3888 0

    SampledNumericDataSource 20816 12856 0

    SampledNumericData_13 18744 9464 0

    Alert 10144 2432 0

    SampledNumericData_15 8264 4400 0

    Attribute 8184 1688 0

so my Onepoint database is full and cannot accept more data from clients. I suspect that WorkflowTimestamps is the problem.

  35. Ravi Gupta says:

Do we need to shut down our SCOM application (by stopping all 3 services) to run the script (the deletepublisher script), or can we run the script while SCOM is fully operational?

  36. raed fouad says:

    Dear Kevin,

You will save my life, since I have around 40,000,000 rows in that table. But I have Windows 2008, and once I extracted the file it has a .TXT extension. How can I run it on Windows 2008, and which extension do I have to use? I changed the extension from .TXT to .VBS but it didn't work.

    Awaiting your reply, and thanks again.

  37. raed fouad says:

    Dear Kevin;

Thanks a lot for your help; I'm running it now.

  38. Timm says:

    Will these scripts work for SCE 2007?  Our OperationsMgr DB is over the 4096MB limit for SQL Express. LocalizedText table is 3GB.

  39. Keith Mescha says:

We have over 78 million rows in our table, and the cleanup script has been running for 11 days already.

    Ouch!

  40. Holland G Humphrey says:

    I just found this post because our nightly maintenance jobs on the OperationsManagerDB were failing because the reindex was taking so long it was getting deadlocked.

    Our LocalizedText table was over 20 million rows, and the script has been running for about 19 hours and hasn't deleted a million rows yet.  It keeps getting blocked by other processes from the console (SELECT queries).  How can I prevent the queries from blocking this so it can complete in a timely manner?

    Thanks.

  41. Holland G Humphrey says:

    The script finally completed…it took 28 hours and after I ran the 2nd script (we are still SP1) it now has 365K rows (it was over 20 million).  The reindex, which was not completing due to deadlock issues, completed in 16 minutes for the entire database (it was running for multiple hours before getting killed due to the deadlock).

    The server our operationsDB resides on is a big quad-core server with 64 GB of RAM.  Since we separate both the tempdb log and data files on their own 33 GB volume (attached to the SAN by dual fiber cards), the allocation of the 18 GB tempdb log file completed fairly quickly.  Had it not been for the locking/blocking I was seeing yesterday, it would have completed much faster.

    Now, Kevin, my next question…even though the LocalizedText table now has 365K rows, it still has data with a TimeAdded timestamp of 2008-08…why would it still have data for every month since August 2008 and not just for the last few months?  Is this a setting that needs to be set?  Can I run a DELETE query to delete all data in LocalizedText with a TimeAdded < '2010-01-01'?

    NOTE: I am the DBA for SCOM in our shop…not the SCOM admin, so I don't have any control over the console and frankly, the Admin, who is somebody offshore, doesn't even know I am helping him/her by handling this problem.  I'm sure they have seen poor performance, and in the past we have opened tickets with MS about the reason the data file kept growing, yet nobody from MS told us a bug existed that could cause this problem.  In fact, they told our SCOM admin at the time, that the fact our database was 60 GB was normal for a company like ours.  I knew at the time that was crap, because with MOM 2005, the operations database never got larger than 20 GB and if it did it meant we had a problem somewhere probably with the nightly load to the warehouse database.

    PS — Thanks so much for posting this blog article…it saved us.

  42. lisa says:

    Hello,

I have executed the cleanup script on our R2 Operations Manager database, and it only executes for approximately 5 minutes with no errors.  Below is the output of the execution:

    (5326165 row(s) affected)

    Duplicate key was ignored.

    (72939 row(s) affected)

    (5326165 row(s) affected)

    (5326151 row(s) affected)

    (0 row(s) affected)

    (0 row(s) affected)

    I execute a reindex and the rows remain.  Any ideas of what I might have amiss is appreciated.

  43. lisa says:

    Hi Kevin,

    I am on R2, and 5,374,237 rows still exist.

    I only executed the LocalizedTextCleanupforSP1.txt script.  I did not attempt the other, but can certainly do so.

    Lisa

  44. lisa says:

    Hi Kevin,

    Wanted to let you know I executed the second script and it performed a cleanup of the localizedtext table.  Thanks for the help.

    Lisa

  45. ibulut says:

If the first script fails, do I have to start from the beginning, or can I rerun the script and have it pick up from where it left off?

  46. Sander says:

    When attempting to run the sp1 script on a database upgraded to R2/CU3, I get the following error:

    Error 8152: "String or binary data would be truncated"

    I believe that the contents of the LTValue table exceed the parameters of LTValueHash (VARBINARY(32)). I tried setting the value to VARBINARY(max), then I receive a different error:

    1919 "Column 'LTValueHash' in table '#LTHashStrings' is of a type that is invalid for use as a key column in an index."

    Any suggestions? Thanks.

  47. Sander says:

    Thanks Kevin. My bad for running the update prior to running the script.

FYI, I tried removing 'LTValueHash' from the indexes and changing the value to VARCHAR(max), but got the same 8152 error. 26.4 MM rows in LocalizedText. Looking forward to your response!

  48. Sander says:

    When I first installed SCOM (~2 yrs ago), I had to manually update the RMS name to include the correct subdomain; otherwise I have not touched the DB. I'll see what PSS has to say and post what I find. Thanks for the feedback!

  49. Sander says:

Thanks to Kevin and MS PSS for resolving my issue! It turns out that the Exchange 2010 Management Pack may create messages which cause rows in the LocalizedText table to exceed 8000 bytes (4000 Unicode characters); this condition will cause the grooming script to fail. PSS provided a modified script that ignores anything beyond 4000 Unicode characters, and considers only the first 4k for a uniqueness comparison.  I ran the new script and my LocalizedText table went from ~23 MM rows to approx. 67k.

    This query will indicate whether your DB has the same problem mine did:

    Select * from LocalizedText where LEN(LTValue) > 4000

    If you encounter this issue, call PSS and reference case # 110120867964466 for the fix. Note, however, that my DB only had a single row which exceeded the threshold. If the query above returns multiple/duplicate rows, the fix may not apply.

  50. mikasan says:

Hi Kevin, thanks for the script. It actually freed up my LocalizedText table, which was 36 million rows and is now down to 136 thousand. I ran the reindex to see whether the database would reflect the free space, but no luck. What should I do so that the OperationsManager DB reflects the new database size? The mdf is still showing 47GB, but when I do a full backup it's only 8GB. Is there anything I'm missing here?

    Thanks in advance.

  51. paul says:

    Thanks very much Kevin,

I've just inherited an Operations Manager 2007 R2 setup impacted by this issue. Millions of rows in LocalizedText.

    Console performance is terrible.

    Your blog is gold!

    cheers

    Paul, Senior DBA.

  52. Phil says:

    4 million rows down to 410k rows in 35 minutes

    EBS 2008 SCE database

    Thank you very much for the solution!

  53. Rob says:

    Same problem here… SCE2010.

    Only the row affected is EventParameter

  54. alex says:

    Hi Kevin, do we need to run these scrips against scom 2012 DB?

  55. Hemant says:

    Hi Kevin,

Many thanks for the post. I just wanted to know: is reindexing the table/DB a must after the scripts are run? Or is the mere purpose of reindexing here to see the freed-up space?

    Thanks in advance!

  56. Frans Ganzinga says:

    Hello Kevin,

Thanks for the helpful post! I ran the script successfully. However, when trying to reindex the database I get the following:

    Could not allocate a new page for database 'OperationsManager' because of insufficient disk space in filegroup 'PRIMARY'. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

    This is probably because the database already way exceeded the 4GB limit of SQL Express..

    Any thoughts?

  57. Natarajan says:

Scheduled reports are not delivered after running this script. Also, data is not getting written to the data warehouse DB. Let me know of any workaround.

  58. John_Curtiss says:

    @anonymous – “I do not expect to see Localizedtext filling the DB on SCOM 2012.”

And yet, I do see it, on SCOM 2012 R2 UR9. LocalizedText is the largest table, and has 1,650,064 rows. Is it safe to run these scripts? If not, what recourse do I have?

    1. Kevin Holman says:

      These scripts should not be run on SCOM 2012.

      1. Retep says:

So if I have millions of records in 2012 R2 (UR9) – how can we delete them?
        Peter