Identify and Enumerate Publishing Pages in a Site Collection

Hello,

Recently I needed to identify all publishing pages in a site collection and perform some actions on each of them, so I wrote a short tool for it.

There is usually more than one way to reach this goal. In my case, however, the site collection has a size of 60 GB and the code needs to run over more than 26,000 pages.

So it makes a difference whether the execution time is 8 hours or less than 2 hours. :-)

My first solution was to find the publishing pages through the “Pages” library of each site. Such an implementation runs entirely on WSS, that is, on the Microsoft.SharePoint namespace, and is not optimized.

 using (SPSite site = new SPSite(url))
 { 
     foreach (SPWeb web in site.AllWebs)
     {
         try
         {
             SPList list = web.Lists["Pages"];
             // SPList is not enumerable itself; iterate over its Items collection
             foreach (SPListItem item in list.Items)
             {
                 // do my stuff on each page
                 // ...
             }
         }
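         catch (Exception)
         {
             // no "Pages" library on this web (or the action failed): skip it
         }
         finally
         {
             // webs returned by AllWebs must be disposed explicitly
             web.Dispose();
         }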
     }
 }

Disadvantage of using only the WSS namespace

On multilingual installations the Pages library has a different name depending on the language of the SPWeb (Pages, Seiten, Paginas, …), so your code needs to be aware of every language.

 

It’s more efficient to use the Publishing namespace: Microsoft.SharePoint.Publishing

 using (SPSite site = new SPSite(url))
 { 
     foreach (SPWeb web in site.AllWebs)
     {
         try
         {  
             if (PublishingWeb.IsPublishingWeb(web))
             {
                 PublishingWeb pubweb = PublishingWeb.GetPublishingWeb(web);
                 PublishingPageCollection pages = pubweb.GetPublishingPages();
     
                 foreach (PublishingPage page in pages)
                 {
                     // do some stuff 
                     // ...
                 }
             }
         }
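         catch (Exception)
         {
             // skip webs where the action fails
         }
         finally
         {
             // webs returned by AllWebs must be disposed explicitly
             web.Dispose();
         }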
     }
 }

The second solution is faster and works regardless of the site language.
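
If you still need the Pages library itself as an SPList (for example, to run a query against it), it can be obtained without knowing its localized name. The following is only a minimal sketch, assuming the PagesList property of PublishingWeb is available in your SharePoint version. The class and method names (PagesLibraryHelper, GetPagesLibrary) are made up for illustration and are not part of the tool described above.

```csharp
using Microsoft.SharePoint;
using Microsoft.SharePoint.Publishing;

// Hypothetical helper, for illustration only.
public static class PagesLibraryHelper
{
    // Returns the Pages library of a publishing web,
    // or null if the web is not a publishing web.
    public static SPList GetPagesLibrary(SPWeb web)
    {
        if (!PublishingWeb.IsPublishingWeb(web))
        {
            return null;
        }

        PublishingWeb pubweb = PublishingWeb.GetPublishingWeb(web);

        // Assumption: PagesList points at the library whatever its
        // localized title is (Pages, Seiten, Paginas, ...).
        return pubweb.PagesList;
    }
}
```

Inside the AllWebs loop this could be called as PagesLibraryHelper.GetPagesLibrary(web), with a null check before touching the returned list.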

Regards,

Patrick