Single Instance Storage (SIS) in Windows Storage Server 2003 R2 is starting to generate some buzz on other blogs. A blog reader asked us to explain SIS in more detail. Over the next couple of days, I’ll publish a series of four posts covering SIS basics and design. Many thanks to Claus Joergensen for providing this information.
SIS is a storage feature available on Microsoft® Windows Storage Server™ 2003 R2. It recovers disk space by reducing the amount of redundant data stored on a volume: it identifies identical files, stores a single copy in the SIS common store, and replaces the original files with links to that copy. Consider the following scenario:
1. Two users receive the same e-mail with an attachment, and both save the attachment to their home folders. Running in the background, SIS detects the two identical files on the volume, moves one of the copies into the SIS common store, and replaces both files with links to the file in the SIS common store.
2. One of the users makes a change to the file. SIS immediately detects that an update is pending for the file, removes the link in the user’s home folder, and replaces it with a copy of the file from the SIS common store. The updates are then applied to that fresh copy of the original file. This is completely transparent to the application.
3. The other user’s file remains in the SIS common store with a link in that user’s home folder, even though only one link to the file is left. When the second user updates the file (assuming there are no other links), the link is deleted and replaced with a copy of the original file, and the file in the SIS common store is deleted.
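The scenario above can be sketched as a toy in-memory model. This is only an illustration of the behavior described, not how SIS is implemented: the class name ToySisVolume, its methods, and the use of SHA-256 to spot identical content are all assumptions made for the example.

```python
import hashlib


class ToySisVolume:
    """Hypothetical in-memory model of a volume with a SIS-style common store."""

    def __init__(self):
        self.files = {}         # path -> ("data", bytes) or ("link", digest)
        self.common_store = {}  # digest -> [bytes, number of links]

    def read(self, path):
        kind, value = self.files[path]
        return self.common_store[value][0] if kind == "link" else value

    def write(self, path, data):
        """Write a file. A linked file is first detached (copy-on-write)."""
        entry = self.files.get(path)
        if entry is not None and entry[0] == "link":
            store_entry = self.common_store[entry[1]]
            store_entry[1] -= 1
            if store_entry[1] == 0:
                # Last link is gone, so the shared copy is deleted too.
                del self.common_store[entry[1]]
        self.files[path] = ("data", data)

    def background_pass(self):
        """Replace identical ordinary files with links to one shared copy."""
        groups = {}
        for path, (kind, value) in self.files.items():
            if kind == "data":
                digest = hashlib.sha256(value).hexdigest()
                groups.setdefault(digest, []).append(path)
        for digest, paths in groups.items():
            if len(paths) < 2 and digest not in self.common_store:
                continue  # unique content stays an ordinary file
            content = self.files[paths[0]][1]
            store_entry = self.common_store.setdefault(digest, [content, 0])
            for path in paths:
                self.files[path] = ("link", digest)
                store_entry[1] += 1


vol = ToySisVolume()
vol.write("/home/alice/report.doc", b"attachment")
vol.write("/home/bob/report.doc", b"attachment")
vol.background_pass()                                  # step 1: both become links
vol.write("/home/alice/report.doc", b"attachment v2")  # step 2: Alice gets her own copy
still_shared = len(vol.common_store) == 1              # step 3: Bob's link keeps the shared copy
vol.write("/home/bob/report.doc", b"attachment v3")    # last link removed, shared copy deleted
```

In this sketch the shared copy survives while at least one link points to it and disappears when the last linked file is rewritten, which mirrors steps 2 and 3.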
SIS is similar to the “symbolic link” feature implemented in UNIX and other operating systems. However, SIS differs from symbolic links in three fundamental ways:
1. If a user has two files sharing disk storage by using SIS and someone modifies one of the files, users of the other file do not see the changes. The SIS link can be thought of as an automatic “copy-on-write” link. The two files are “linked” only as long as they are identical. In contrast, with symbolic links, a change made through one link changes the content seen through every link to the file.
2. The underlying shared disk storage that backs SIS links is maintained by the system and is only deleted if all the SIS links pointing to it are deleted. In contrast, symbolic links can “break” if a user deletes the target file.
3. SIS works automatically without any user involvement, in contrast to symbolic links that must be set up and maintained by the user. SIS automatically determines that two or more files have the same content and links them together.
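For comparison, the symbolic-link side of points 1 and 2 can be reproduced with a few lines of Python. This is a minimal sketch that assumes a POSIX-style system where an ordinary user may create symbolic links; the function name symlink_demo and the file names are made up for the example, and it demonstrates only symbolic-link behavior, not SIS.

```python
import os
import tempfile


def symlink_demo():
    """Show that a symlink shares writes with its target and can break."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")

        with open(target, "w") as f:
            f.write("original")
        os.symlink(target, link)

        # A write through the link changes the one underlying file.
        with open(link, "w") as f:
            f.write("changed via link")
        with open(target) as f:
            seen_through_target = f.read()

        # Deleting the target leaves a dangling ("broken") link behind.
        os.remove(target)
        link_is_broken = os.path.islink(link) and not os.path.exists(link)

        return seen_through_target, link_is_broken


result = symlink_demo()  # ("changed via link", True) on a POSIX system
```

With SIS, by contrast, the write would have detached the modified file from the shared copy, and the shared storage would stay in place until every link to it was gone.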
Benefits of SIS
SIS provides the following benefits:
- SIS reduces disk space consumption by eliminating duplicate files. Our own IT department (MS-IT) saved 14 TB (40%) on servers hosting MS products
- SIS is a set-it-and-forget-it feature. It does not require daily maintenance.
- SIS is transparent to end users and applications
- SIS provides a backup API that allows backup applications to determine whether a file is SIS’d and back it up only once (most major backup application vendors support SIS)
Requirements for Using SIS
- SIS can be enabled on a per-volume basis on up to 6 volumes on a server running Windows Storage Server 2003 R2
- By default, SIS evaluates only files larger than 32 KB
- SIS can be used only on local NTFS volumes
- SIS cannot be used on the system or boot volume or on remote drives
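To get a rough feel for how many files on a folder tree would even be candidates under the default size threshold, a script along these lines can help. It is a sketch only: the function names are hypothetical, grouping by size and then by SHA-256 digest is an assumption chosen for the example rather than a description of how SIS compares files, and the result is a list of duplicate groups, not a prediction of actual SIS savings.

```python
import hashlib
import os
from collections import defaultdict

MIN_SIZE = 32 * 1024  # default threshold from the list above: files larger than 32 KB


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_candidates(root, min_size=MIN_SIZE):
    """Return groups of paths under root whose files are larger than min_size
    and have identical content (judged by size plus SHA-256 digest)."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symbolic links and anything that is not a regular file
            size = os.path.getsize(path)
            if size > min_size:
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

Calling find_duplicate_candidates on a share's root folder returns a list of lists, each holding the paths of files with matching content; files at or below the threshold are ignored, as in the default described above.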
More to come on SIS design. Stay tuned!