SQL Server collation and MOSS 2007

Recently I've worked in some SharePoint RAP's and I've found something which is most common in SharePoint farms, customers share SQL Server to host several applications, since many times SharePoint is not considered as a critical application its databases are allocated in a shared SQL Server, unfortunatelly and eventhough customers know that this is not recommended because of performance issues they still doing it. Customer creates different SQL Server instances to host different applications, sometimes other applications uses a different collation to reach business needs. Therefore SQL Server engine uses a specific collation and every instance configured a different one.

First will define SQL Server Collation (http://msdn.microsoft.com/en-us/library/ms187582(v=SQL.105).aspx):

Collations specify the rules for how strings of character data are sorted and compared, based on the norms of particular languages and locales. For example, in an ORDER BY clause, an English speaker would expect the character string 'Chiapas' to come before 'Colima' in ascending order. However, a Spanish speaker in Mexico might expect words beginning with 'Ch' to appear at the end of a list of words starting with 'C'. Collations dictate these kinds of sorting and comparison rules. The Latin_1 General collation will sort 'Chiapas' before 'Colima' in an ORDER BY ASC clause, whereas the Traditional_Spanish collation will sort 'Chiapas' after 'Colima'.

When a collation is specified for non-Unicode character data, such as char, varchar, and text data, a particular code page is associated with the collation. For example, if a char column in a table is defined with the Latin1_General collation, the data in that column is interpreted and displayed by SQL Server using the 1252 code page. For more information about code pages and collations, see Code Page Architecture.

Multiple collations can use the same code page for non-Unicode data.

Collations specified for Unicode-only data, such as nchar, nvarchar, and nvarchar(max), do not have associated code pages. Unicode data handles most universal characters. For more information, see Working with Unicode Data.

SQL Server collation becomes an important thing when a SharePoint farm will be provisioned or when SharePoint admins are creating a backup/restore plan, I've seen a common problem when database restore tasks are implemented. What does it happen when a full backup is performed using CA or stsadm, what type of collation is saved, SQL's or SharePoint instance?, What does it happen when DBA installs a new SQL Server for testing and restoration processes and does not care about collation? worst what does it happen when a DBA role does not exist in the organization and SharePoint administrators have to install SQL Server and they do not know how to install it?

It does not seem to be important and install SharePoint or restore and entire SP Farm is so easy. Let me tell you that in my last experience a customer with a small SharePoint farm experienced this problem with SQL Server collation, there is no a dba and SharePoint admins do not have SQL knowledge, after a bad SharePoint workflow experience they had an issue with database available space and a custom application stopped working because of the lack of disk space. Finally they got fix the workflow issue but they did not have any SharePoint backup, some Site Collection backups were done in the past but they  were not certain that those backups were healthy, then they install a new test SharePoint farm using SQL Server defaults.

Well answering the previous questions using customer example every attempt to restore a database failed, if they tried usign CA or stsadm they got a message about a database object failure or if they try with SQL Server tools they got a database schema inconsistency.

1. When a SharePoint database backup is done the collation saved is the one to refers to the SharePoint instance not SQL Server engine.

2. If DBA installs SQL Server by default without paying attention to collation already configured for SharePoint databases, we cannot restore the SharePoint database backup.

3. If SQL Server collation is not set properly, restore tasks will not work.

The SQL Server database collation must be configured for case-insensitive, accent-sensitive, Kana-sensitive, and width-sensitive. This is to ensure file name uniqueness consistent with the Windows operating system (http://technet.microsoft.com/en-us/library/cc288970(office.12).aspx), however, we do not support changing the default collation (Latin1_General_CI_AS_KS_WS) for SharePoint databases to any other collations (CI, AS, KS, WS). We support any CI collation for the SQL instance (for master, tempdb databases). However we recommend using SQL_Latin1_General_CP1_CI_AS as the instance (master, tempdb databases) collation since that’s where the bulk of our testing has been (http://support.microsoft.com/kb/2008668).

Note: If SQL Server collation is not Case-Insensitive and Non-Binary comparisons SharePoint farm cannot be installed.

Hope this help you when you're planning to deploy a SharePoint farm or building backup/restore plans, before document SQL Server configurations and do not forget that sharing SQL Server for SharePoint is not recommended.

Comments (0)

Skip to main content