Insufficient data from Andrew Fryer

The place where I page to when my brain is full up of stuff about the Microsoft platform

SQL Server 2008 Filestream FAQs

I get a lot of questions about how filestream works with the other new features in SQL Server, so here’s what I know.

First for those that don’t know, filestream is a new data type in SQL server 2008 that  handles large unstructured data files like they are in the database. The clever bit is that it also allows them to be streamed out very quickly using win32 api as though they were files on a file system. 

This is done by creating a special filestream filegroup, which actually creates a directory tree for the files to be stored but they now have filenames of guids controlled by SQL server. Filestream thus differs from Varbinary(max) where the data is chopped up into SQL Server 8k pages which then have to be stuck back together for retrieval. 

General FAQs:   

How do Transactions work?  You can actually roll back a transaction where you have attempted to update a filestream object and it will revert to the previous version.  This is because the update will copy in the new object and preserve the original until you commit.  Needless to say lots of disk space could be needed here for example if your are storing HD videos in your database. 

How are Backups affected? Backing up and restoring the database is exactly the same. This is important as it allows the structured part of the row to be exactly in step with the filestream object after a restore.  Additionally because it is filegroup it can be backed up by itself or excluded form other backups as required.

What if I need my Filestream data in a high availability scenario?

  • Mirroring. Most importantly Mirroring isn’t supported (so no filestream filegroup will be allowed on the principal or mirror), so either you continue to use Varbinary(max) in the same way as in SQL Server 2005 for your unstructured files, or you go for…
  • Clustering. This is supported, but the filestream filegroup must be on a shared disk (so maybe you have to rely on your storage to give you high availability here). I would add that 2 node clusters are allowed in standard edition.
  • Log shipping  is also supported, but both ends must be running SQL Server 2008 with Filestream turned on.

SQL Server 2008 interoperability with other new features..

What about backup compression? Backup compression doesn’t work with filestream.  generally the files will already be compressed (think jpeg, divx etc.)

What about Transparent Data Encryption? TDE doesn’t apply to filestream and in any case is only in enterprise edition.  If you need to protect the files in a filestream filegroup use the Encrypted Files System (EFS) .btw filestream filegroups can only be created on NTFS volumes.

Can I uses Integrated Full Text Search (IFTS) with Filestream? Yes in exactly the same way as you can for normal varbinary (max) data in SQL Server 2005/8.  So you must have a unique identifier for each row and a separate column on each row to identify the type of data being stored (pdf, xlxs, docx, html and so on) for the ifilters to work.

Which editions of SQL Server does it work with? Filestream applies to SQL Server Express up to Enterprise edition. btw the filestream filegroup does not count as part of the 4Gb limit to the size of an Express database and these special filegroup can be as large as you like. 

For further reading go to TechNet’s Books on Line for SQL Server 2008