SQL Server 2008 and Unstructured Data

SQL Server 2008 introduces two new options to store unstructured data, in addition to the current BLOB support we have in SQL Server 2005.

IMAGE

Back in SQL Server 2000 and SQL Server 7 days, we used the IMAGE data type to store binary large objects (a.k.a. BLOBs). Large, by the way, is defined as anything that would not fit a SQL Server data page (around 8,000 bytes). This data type is still present in SQL Server 2005 today, but it should be going away soon. The current recommendation is to “avoid using these data types in new development work, and plan to modify applications that currently use them.”  This applies also to the similar TEXT and NTEXT data types.

This is described in more detail at https://msdn2.microsoft.com/en-us/library/ms187993.aspx.

VARBINARY

To replace those, SQL Server 2005 introduced the concept of using the VARBINARY(MAX) data type. This variation of the VARBINARY(n) data type extends the usual limit of around 8,000 bytes and allows you store up to 2GB of data (you can store up to 2^31-1 bytes, to be precise). This applies also to the similar VARCHAR(MAX) and NVARCHAR(MAX) data types.

You can learn more about it at https://msdn2.microsoft.com/en-us/library/ms188362.aspx

FILESTREAM

A new option in SQL Server 2008 is to use a FILESTREAM attribute in a VARBINARY column. This allows you to store the actual data on the server’s file system, while managing and accessing it directly within the context of the database. Applications can try to accomplish this by themselves (store a file path in a VARCHAR field and store the file somewhere), but having this managed by SQL Server itself removes a lot of the complexity. Now SQL can maintain the integrity between records in the database and the files associated with those records.  The files are stored in the same server where the SQL Server 2008 database is.

Please note that this is not a new data type, but a new attribute of the existing VARBINARY(MAX) data type. This is important, because it makes it easier for applications to leverage the new feature, since it is provided as a familiar data type. To use you this, you need to enable it on the SQL instance (using “EXEC sp_filestream_configure”), create databases supporting it (using “CONTAINS FILESTREAM” in the “CREATE DATABASE”), use columns that leverage it (using the “FILESTREAM” attribute in a VARBINARY column on the “CREATE TABLE”). To take advantage of the alternate (file handle) access path, you can use specific commands to insert, update and delete data (using “GET_FILESTREAM_TRANSACTION_CONTEXT()” and calling the OpenSqlFilestream API).

You can check the pre-release documentation, including sample code, at https://technet.microsoft.com/en-us/library/bb895234(SQL.100).aspx

RBS

SQL Server 2008 also introduces a new client-side API that allows for storing BLOBs on a Remote BLOB store (RBS). An RBS provider is typically a separate box on same the network as the SQL Server. This is managed by SQL Server and integrity between the database records and the RBS external store is maintained. More information is expected with a future CTP release of SQL Server 2008, but there is already some content covering this new feature.

For additional details on RBS, check: