SQL Server 2005 XQuery Performance Tips

During some recent Xquery performance work, three discoveries emerged which dramatically improved the performance of my team's component. I'd like to share them here..

Specify A Single Root Node

The XML parser in SQL2K5 doesn't assume that the XML it receives is well-formed. It will therefore look for multiple occurrences of the root element in a well-formed XML document unless you tell it not to. This syntax:

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain [1] /@DomainName', 'nvarchar(255)')

..performs twice as fast as this syntax..

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain/@DomainName', 'nvarchar(255)')

If you have schemas defined for these XML objects and you're using the schemas in retrieval, this will probably offer no performance enhancements – you'd get the same improvement the [1] above offers.

Note that you can apply the [1] to as many nodes as applicable for your XML. This syntax is also valid:

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain [1] /Domain [1] /@DomainName', 'nvarchar(255)')

Expand XML Paths In Aggregation/Evaluation

This is a simple syntactic tweak provided by Babu Krishnaswamy of the SQL2K5 dev team. It offers about a 60% performance boost over our previous syntax by removing a JOIN in the retrieval plan generated by the SQL2K5 XQuery parser. It's my belief that this benefit would accrue to an XML query in any tier using this syntax.

Old query:

SELECT  @InsertRows = @ParamWorkerXml.value('count(/FilteredDomain[1] /[@Action="I"] )', 'int')

New query:

SELECT  @InsertRows = @ParamWorkerXml.value('count(/FilteredDomain[1] /@Action[.="I"] )', 'int')

Avoid Applying In-Line Functions To XQuery Results

This is another of Babu's magic SQL Xquery tricks. A 25% performance improvement was observed by applying this coding strategy.

Old query:

UPDATE  [DomainValue]
SET     [SortSequence] = NULLIF(ISNULL(ref.value('@SortSequence','int'),[SortSequence]), -32000),
        [ParentDomainValueId] = CASE ref.value('@ParentDomainValueId','int')
                                    WHEN -32000 THEN NULL
                                    WHEN NULL THEN [ParentDomainValueId]
                                    ELSE ref.value('@ParentDomainValueId','int')
                                    END,

        [IsVisible] = ISNULL(ref.value('@IsVisible','bit'),[IsVisible]),
        [ActiveDate] = NULLIF(ISNULL(ref.value('@ActiveDate','datetime'),[ActiveDate]), '1753-01-01 00:00:00.000'),
        [ExpirationDate] = NULLIF(ISNULL(ref.value('@ExpirationDate','datetime'),[ExpirationDate]), '9999-12-31 00:00:00.000'),
        [ObjectStatusReasonId] = ISNULL(ref.value('@ObjectStatusReasonId','int'),[ObjectStatusReasonId]),
        [ObjectActiveFlag] = ISNULL(ref.value('@ObjectActiveFlag','bit'),[ObjectActiveFlag]),
        [LastModifiedDate] = GETUTCDATE(),
        [LastModifiedById] = @ParamLastModifiedById,
        [LastModifierTypeId] = @ParamLastModifierTypeId
FROM    [DomainValue]
JOIN    @ParamWorkerXml.nodes('/Domain[1]/DomainValue') AS node(ref)
ON      [DomainValue].[DomainId] = ref.value('@DomainId', 'int')
AND     [DomainValue].[DomainValueId] = ref.value('@DomainValueId', 'int')
AND     ref.value('@Action','nchar(1)') = N'U'
WHERE   [LastModifiedDate] = CASE @ParamAllowDirtyWrite
                                        WHEN 0 THEN ref.value('@LastModifiedDate', 'datetime')
                                        ELSE [LastModifiedDate]
                                 END

New query:

UPDATE  [dv]
SET     [SortSequence] = NULLIF(ISNULL(InputXml.SortSequence,[dv].[SortSequence]), -32000),
        [ParentDomainValueId] = CASE InputXml.ParentDomainValueId
                                    WHEN -32000 THEN NULL
                                    WHEN NULL THEN [dv].[ParentDomainValueId]
                                    ELSE InputXml.ParentDomainValueId
                                    END,
        [IsVisible] = ISNULL(InputXml.IsVisible,[dv].[IsVisible]),
        [ActiveDate] = NULLIF(ISNULL(InputXml.ActiveDate,[dv].[ActiveDate]), '1753-01-01 00:00:00.000'),
        [ExpirationDate] = NULLIF(ISNULL(InputXml.ExpirationDate,[dv].[ExpirationDate]), '9999-12-31 00:00:00.000'),
        [ObjectStatusReasonId] = ISNULL(InputXml.ObjectStatusReasonId,[dv].[ObjectStatusReasonId]),
        [ObjectActiveFlag] = ISNULL(InputXml.ObjectActiveFlag,[dv].[ObjectActiveFlag]),
        [LastModifiedDate] = GETUTCDATE(),
        [LastModifiedById] = @ParamLastModifiedById,
        [LastModifierTypeId] = @ParamLastModifierTypeId
FROM    (
        SELECT ref.value('@DomainId', 'int') AS DomainId,
                ref.value('@DomainValueId', 'int') AS DomainValueId,
                ref.value('@SortSequence','int') AS SortSequence,
                ref.value('@ParentDomainValueId','int') AS ParentDomainValueId,
                ref.value('@IsVisible','bit') AS IsVisible,
                ref.value('@ActiveDate','datetime') AS ActiveDate,
                ref.value('@ExpirationDate','datetime') AS ExpirationDate,
                ref.value('@ObjectStatusReasonId','int') AS ObjectStatusReasonId,
                ref.value('@ObjectActiveFlag','bit') AS ObjectActiveFlag,
                ref.value('@LastModifiedDate', 'datetime') AS LastModifiedDate
        FROM @ParamWorkerXml.nodes('/Domain[1]/DomainValue')
        AS node(ref)
        WHERE ref.value('@Action','nchar(1)') = N'U'
        ) AS InputXml
JOIN    [dbo].[DomainValue] [dv]
ON      [dv].[DomainId] = InputXml.DomainId
AND     [dv].[DomainValueId] = InputXml.DomainValueId
AND     [dv].[LastModifiedDate] = CASE @ParamAllowDirtyWrite
                                        WHEN 0 THEN InputXml.LastModifiedDate
                                        ELSE [dv].[LastModifiedDate]
                                  END