SQL Server 2005 XQuery Performance Tips

During some recent Xquery performance work, three discoveries emerged which dramatically improved the performance of my team's component.  I'd like to share them here..

Specify A Single Root Node

The XML parser in SQL2K5 doesn't assume that the XML it receives is well-formed.  It will therefore look for multiple occurrences of the root element in a well-formed XML document unless you tell it not to.  This syntax:

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain[1]/@DomainName', 'nvarchar(255)')

..performs twice as fast as this syntax..

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain/@DomainName', 'nvarchar(255)')

If you have schemas defined for these XML objects and you're using the schemas in retrieval, this will probably offer no performance enhancements – you'd get the same improvement the [1] above offers.

Note that you can apply the [1] to as many nodes as applicable for your XML.  This syntax is also valid:

SELECT  @DomainName = @XmlDomain.value('/FilteredDomain[1]/Domain[1]/@DomainName', 'nvarchar(255)')

Expand XML Paths In Aggregation/Evaluation

This is a simple syntactic tweak provided by Babu Krishnaswamy of the SQL2K5 dev team.  It offers about a 60% performance boost over our previous syntax by removing a JOIN in the retrieval plan generated by the SQL2K5 XQuery parser.  It's my belief that this benefit would accrue to an XML query in any tier using this syntax.

Old query:

SELECT  @InsertRows = @ParamWorkerXml.value('count(/FilteredDomain[1]/[@Action="I"])', 'int')

New query:

SELECT  @InsertRows = @ParamWorkerXml.value('count(/FilteredDomain[1]/@Action[.="I"])', 'int')

Avoid Applying In-Line Functions To XQuery Results

This is another of Babu's magic SQL Xquery tricks.  A 25% performance improvement was observed by applying this coding strategy.

Old query:

UPDATE  [DomainValue]
SET     [SortSequence]         = NULLIF(ISNULL(ref.value('@SortSequence','int'),[SortSequence]), -32000),
        [ParentDomainValueId]  = CASE ref.value('@ParentDomainValueId','int')
                                    WHEN  -32000 THEN NULL
                                    WHEN  NULL   THEN [ParentDomainValueId]
                                    ELSE  ref.value('@ParentDomainValueId','int')

        [IsVisible]            = ISNULL(ref.value('@IsVisible','bit'),[IsVisible]),
        [ActiveDate]           = NULLIF(ISNULL(ref.value('@ActiveDate','datetime'),[ActiveDate]), '1753-01-01 00:00:00.000'),
        [ExpirationDate]       = NULLIF(ISNULL(ref.value('@ExpirationDate','datetime'),[ExpirationDate]), '9999-12-31 00:00:00.000'),
        [ObjectStatusReasonId] = ISNULL(ref.value('@ObjectStatusReasonId','int'),[ObjectStatusReasonId]),
        [ObjectActiveFlag]     = ISNULL(ref.value('@ObjectActiveFlag','bit'),[ObjectActiveFlag]),
        [LastModifiedDate]     = GETUTCDATE(),
        [LastModifiedById]     = @ParamLastModifiedById,
        [LastModifierTypeId]   = @ParamLastModifierTypeId
FROM    [DomainValue]
JOIN    @ParamWorkerXml.nodes('/Domain[1]/DomainValue') AS node(ref)
ON      [DomainValue].[DomainId]        = ref.value('@DomainId', 'int')
AND     [DomainValue].[DomainValueId]   = ref.value('@DomainValueId', 'int')
AND     ref.value('@Action','nchar(1)') = N'U'
WHERE   [LastModifiedDate]     = CASE @ParamAllowDirtyWrite
                                        WHEN 0 THEN ref.value('@LastModifiedDate', 'datetime')
                                        ELSE [LastModifiedDate]

New query:

UPDATE  [dv]
SET     [SortSequence]         = NULLIF(ISNULL(InputXml.SortSequence,[dv].[SortSequence]), -32000),
        [ParentDomainValueId]  = CASE InputXml.ParentDomainValueId
                                    WHEN  -32000 THEN NULL
                                    WHEN  NULL   THEN [dv].[ParentDomainValueId]
                                    ELSE  InputXml.ParentDomainValueId
        [IsVisible]            = ISNULL(InputXml.IsVisible,[dv].[IsVisible]),
        [ActiveDate]           = NULLIF(ISNULL(InputXml.ActiveDate,[dv].[ActiveDate]), '1753-01-01 00:00:00.000'),
        [ExpirationDate]       = NULLIF(ISNULL(InputXml.ExpirationDate,[dv].[ExpirationDate]), '9999-12-31 00:00:00.000'),
        [ObjectStatusReasonId] = ISNULL(InputXml.ObjectStatusReasonId,[dv].[ObjectStatusReasonId]),
        [ObjectActiveFlag]     = ISNULL(InputXml.ObjectActiveFlag,[dv].[ObjectActiveFlag]),
        [LastModifiedDate]     = GETUTCDATE(),
        [LastModifiedById]     = @ParamLastModifiedById,
        [LastModifierTypeId]   = @ParamLastModifierTypeId
FROM    (
        SELECT  ref.value('@DomainId', 'int')               AS DomainId,
                ref.value('@DomainValueId', 'int')          AS DomainValueId,
                ref.value('@SortSequence','int')            AS SortSequence,
                ref.value('@ParentDomainValueId','int')     AS ParentDomainValueId,
                ref.value('@IsVisible','bit')               AS IsVisible,
                ref.value('@ActiveDate','datetime')         AS ActiveDate,
                ref.value('@ExpirationDate','datetime')     AS ExpirationDate,
                ref.value('@ObjectStatusReasonId','int')    AS ObjectStatusReasonId,
                ref.value('@ObjectActiveFlag','bit')        AS ObjectActiveFlag,
                ref.value('@LastModifiedDate', 'datetime')  AS LastModifiedDate
        FROM    @ParamWorkerXml.nodes('/Domain[1]/DomainValue')
        AS      node(ref)
        WHERE   ref.value('@Action','nchar(1)') = N'U'
        )   AS InputXml
JOIN    [dbo].[DomainValue] [dv]
ON      [dv].[DomainId]         = InputXml.DomainId
AND     [dv].[DomainValueId]    = InputXml.DomainValueId
AND     [dv].[LastModifiedDate] = CASE @ParamAllowDirtyWrite
                                        WHEN 0 THEN InputXml.LastModifiedDate
                                        ELSE [dv].[LastModifiedDate]

Comments (3)

  1. Anonymous says:

    @tvPrasad: thanks for your comment!

    What the third tip says is that if you're going to run intrinsic TSQL functions against a column in an XML result set, you'll get better performance if you shred the XML in a derived table and then run the same test against the derived table rather than the XML.

    Maybe it's a little clearer with fewer columns:


      UPDATE  [DomainValue]
      SET     [SortSequence]         = NULLIF(ISNULL(ref.value('@SortSequence','int'),[SortSequence]), -32000)
      FROM    [DomainValue]
      JOIN    @ParamWorkerXml.nodes('/Domain[1]/DomainValue') AS node(ref)


      UPDATE  [dv]
      SET     [SortSequence]         = NULLIF(ISNULL(InputXml.SortSequence,[dv].[SortSequence]), -32000)
      FROM    (
           SELECT  ref.value('@DomainId', 'int')               AS DomainId,
                   ref.value('@DomainValueId', 'int')          AS DomainValueId,
                   ref.value('@SortSequence','int')            AS SortSequence
           FROM    @ParamWorkerXml.nodes('/Domain[1]/DomainValue')
           AS      node(ref)
           )   AS InputXml
      JOIN    [dbo].[DomainValue] [dv]
      ON      [dv].[DomainId]         = InputXml.DomainId
      AND     [dv].[DomainValueId]    = InputXml.DomainValueId

    I hope this helps.  Please post again if you need more information.


  2. tvPrasad says:

    Thanks for sharing.

    I like tips 1 and 2.

    I am not sure I understand tip 3.

Skip to main content