O’DFS Shares! Where Art Thou? – Part 2/3

Hello, this is Sabin and Shravan from the Microsoft Directory Services Support team once again. We will continue our discussion regarding slowness/delay experienced by clients in accessing a DFS Namespace. In Part 1 of this blog, we reviewed the referral process for Domain Based Namespaces and took a closer look at a working scenario where clients were able to access the DFS Shares without delays.

Here in Part 2 of this blog on troubleshooting slow access to DFS shares, we will review a scenario where a user sees a delay while trying to access a DFS share (\\contoso.com\rootdfsn\data) and both DFS servers, MS1 and MS2, are in the same site as the client.

DFS SETUP:

[Image: DFS setup diagram]

STEPS TAKEN:

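We reproduced the issue with slow DFS access and ran the following tools:

Step 1: Review the client's referral cache with DfsUtil (dfsutil /pktinfo). The following is the output from the client: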

Entry: \Ms2\rootdfsn\Data
ShortEntry: \Ms2\rootdfsn\Data
Expires in 1711 seconds
UseCount: 0 Type:0x1 ( DFS )
0:[\Ms2\DataReplica] State:0x110 ( ACTIVE TARGETSET )
1:[\Ms1\Data] State:0x00

Entry: \contoso.com\rootdfsn
ShortEntry: \contoso.com\rootdfsn
Expires in 208 seconds
UseCount: 0 Type:0x81 ( REFERRAL_SVC DFS )
0:[\Ms1\rootdfsn] State:0x100 ( TARGETSET )
1:[\Ms2\rootdfsn] State:0x10 ( ACTIVE)
DfsUtil command completed successfully.

Analysis:

1. In this case, the root referral (Type 0x81) shows a TARGETSET of two servers, listed in the following order:

a. MS1
b. MS2

2. As we can see, MS2 is set to ACTIVE. Ideally, the client should have chosen the first root target (MS1) in the domain-based root referral, connected to that root server, and navigated the subfolders under the root folder. On encountering a link folder, the root server should send a status message to the client, indicating that this is a link folder that requires redirection.

3. If the link referral is not in the cache, the Vista client should have connected to the IPC$ named pipe of the root server in the user’s context (or in the context of the LocalSystem account for pre-Vista clients) and requested a link referral from the root server, which in turn returns a list of link targets.

4. However, MS2 (the second server in the list) is set as the ACTIVE server because the client was unable to connect to and/or request a link referral from MS1, the first server in the targetset.

Note: The client spent some time traversing the root referral list before it finally reached the link target. This can contribute to the delay in accessing a DFS share.

5. Finally, as we can see in the referral cache, the client gets a link referral from MS2 and eventually accesses the target on MS2.
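
The check described in the analysis above (is the ACTIVE target the first one in the referral list?) can be sketched in a few lines of Python. This is only an illustration: the flag values 0x10 (ACTIVE) and 0x100 (TARGETSET) are read off the output shown above rather than taken from an official table, and the function names are hypothetical.

```python
import re

# Flag values inferred from the listing above (assumption, not an official table).
STATE_ACTIVE = 0x10
STATE_TARGETSET = 0x100

# Referral cache text copied from the output above.
SAMPLE = r"""
Entry: \Ms2\rootdfsn\Data
ShortEntry: \Ms2\rootdfsn\Data
Expires in 1711 seconds
UseCount: 0 Type:0x1 ( DFS )
0:[\Ms2\DataReplica] State:0x110 ( ACTIVE TARGETSET )
1:[\Ms1\Data] State:0x00

Entry: \contoso.com\rootdfsn
ShortEntry: \contoso.com\rootdfsn
Expires in 208 seconds
UseCount: 0 Type:0x81 ( REFERRAL_SVC DFS )
0:[\Ms1\rootdfsn] State:0x100 ( TARGETSET )
1:[\Ms2\rootdfsn] State:0x10 ( ACTIVE)
"""

TARGET_RE = re.compile(r"^(\d+):\[(.+?)\]\s+State:(0x[0-9A-Fa-f]+)")


def parse_cache(text):
    """Return {entry_path: [(index, target, state), ...]} from cache text."""
    entries, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Entry:"):
            current = line.split(":", 1)[1].strip()
            entries[current] = []
        elif current is not None:
            match = TARGET_RE.match(line)
            if match:
                entries[current].append(
                    (int(match.group(1)), match.group(2), int(match.group(3), 16))
                )
    return entries


def failed_over(targets):
    """True when the ACTIVE target is not the first one in the referral list."""
    active = [index for index, _, state in targets if state & STATE_ACTIVE]
    return bool(active) and active[0] != 0


if __name__ == "__main__":
    for entry, targets in parse_cache(SAMPLE).items():
        print(entry, "-> failed over" if failed_over(targets) else "-> first target active")
```

For the cache shown above, the root entry is flagged because MS2 (index 1) is ACTIVE, while the link entry is not flagged because its first target is the ACTIVE one.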

Question:

Why was the client not able to access MS1 DFS root?

Step 2: The following is an excerpt from dfsdiag /testdfsconfig /dfsroot:\\contoso.com\rootdfsn

Starting TestDfsConfig ....
Retrieving All the Root Targets ....
Validating DFS Service ....

Validating DFS Service on MS1.
DFSDIAG_ERROR - SYS - The RPC server is unavailable.

Validating DFS Service on MS2.
DFSDIAG_INFO - APPL - DFS Service on MS2 is OK.

Validating Registry Entries ....
DFSDIAG_ERROR - SYS - The network path was not found.
DFSDIAG_WARNING - APPL - MS1's Registry not accessible;Ignoring this for comparison.
DFSDIAG_INFO - APPL - Not a single comparison occurred due to errors.

Finished TestDfsConfig.

Analysis:

As the output above shows, either the DFS service on MS1 is not running or we are unable to bind to this server because of RPC errors.
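
When there are many root targets, reading this output by eye gets tedious. The following Python sketch groups the DFSDIAG result lines under the server they belong to and lists the servers that reported errors. It assumes the output keeps the "Validating DFS Service on <server>." layout shown above; the function names are hypothetical.

```python
import re

# Excerpt of the dfsdiag output shown above.
SAMPLE = """\
Validating DFS Service ....

Validating DFS Service on MS1.
DFSDIAG_ERROR - SYS - The RPC server is unavailable.

Validating DFS Service on MS2.
DFSDIAG_INFO - APPL - DFS Service on MS2 is OK.

Validating Registry Entries ....
DFSDIAG_ERROR - SYS - The network path was not found.
"""

HEADER_RE = re.compile(r"^Validating DFS Service on (\S+?)\.$")
RESULT_RE = re.compile(r"^DFSDIAG_(ERROR|WARNING|INFO) - (\w+) - (.*)$")


def service_status(text):
    """Map each server to the (severity, message) pairs listed under its header."""
    status, server = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            server = header.group(1)
            status[server] = []
        elif line.startswith("Validating "):
            server = None  # a different test section begins here
        elif server is not None:
            result = RESULT_RE.match(line)
            if result:
                status[server].append((result.group(1), result.group(3)))
    return status


def servers_with_errors(status):
    """Return the sorted names of servers that have at least one ERROR line."""
    return sorted(
        name for name, results in status.items()
        if any(severity == "ERROR" for severity, _ in results)
    )


if __name__ == "__main__":
    print(servers_with_errors(service_status(SAMPLE)))
```

The registry error is deliberately not attributed to a server, because the output above does not name one on that line.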

Step 3: Run dfsdiag /testdfsintegrity /dfsroot:\\contoso.com\rootdfsn /full

Starting TestDfsIntegrity ....

DFSDIAG_ERROR - SYS - The RPC server is unavailable.
DFSDIAG_ERROR - SYS - The specified domain either does not exist or could not be contacted.
DFSDIAG_WARNING - APPL - Unable to access dfs metadata of \\MS1\rootdfsn.

Finished TestDfsIntegrity.

Analysis:

This is further evidence of problems accessing or binding to the MS1 root server.

Note: Alternatively, we could run dfsdiag /testreferral /dfspath:\\contoso.com\rootdfsn\data /full which collectively runs all the above tests.

Step 4: Try to access the DFS share on MS1 using NetBIOS and FQDN

\\ms1\data - Error: The specified network name is no longer available
\\ms1.contoso.com\data - Error: The network path was not found

Note: Please make sure you are able to ping ms1.contoso.com before doing the above steps.

Step 5: Run a Netmon trace (network capture) while trying to access the DFS share from client

The following is an excerpt from Netmon for the same setup. Not all frames are shown:

18 VistaDFSClient DC2 DFS DFS:Get DFS Referral Request, FileName: \contoso.com\rootdfsn, MaxReferralLevel: 4
19 DC2 VistaDFSClient DFS DFS:Get DFS Referral Response, NumberOfReferrals: 2 VersionNumber: 4

NumberOfReferrals: 2 (0x2)
DfsPath: \contoso.com\rootdfsn
DfsAlternatePath: \contoso.com\rootdfsn
TargetPath: Index:1 \Ms1\rootdfsn
TargetPath: Index:2 \Ms2\rootdfsn

20 VistaDFSClient MS1 TCP TCP:Flags=......S., SrcPort=51519, DstPort=Microsoft-DS(445), PayloadLen=0, Seq=15551468, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192
21 MS1 VistaDFSClient TCP TCP:Flags=...A..S., SrcPort=Microsoft-DS(445), DstPort=51519, PayloadLen=0, Seq=2527195883, Ack=15551469, Win=16384 ( Negotiated scale factor 0x0 ) = 16384
22 VistaDFSClient MS1 TCP TCP:Flags=...A...., SrcPort=51519, DstPort=Microsoft-DS(445), PayloadLen=0, Seq=15551469, Ack=2527195884, Win=513 (scale factor 0x8) = 131328
23 MS1 VistaDFSClient TCP TCP:Flags=.....R.., SrcPort=Microsoft-DS(445), DstPort=51519, PayloadLen=0, Seq=2527195884, Ack=2527195884, Win=0 (scale factor 0x0) = 0
24 VistaDFSClient MS2 TCP TCP:Flags=......S., SrcPort=51520, DstPort=Microsoft-DS(445), PayloadLen=0, Seq=514747392, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192
25 MS2 VistaDFSClient TCP TCP:Flags=...A..S., SrcPort=Microsoft-DS(445), DstPort=51520, PayloadLen=0, Seq=1097579383, Ack=514747393, Win=16384 ( Negotiated scale factor 0x0 ) = 16384
26 VistaDFSClient MS2 TCP TCP:Flags=...A...., SrcPort=51520, DstPort=Microsoft-DS(445), PayloadLen=0, Seq=514747393, Ack=1097579384, Win=513 (scale factor 0x8) = 131328
27 VistaDFSClient MS2 SMB SMB:C; Negotiate, Dialect = PC NETWORK PROGRAM 1.0, LANMAN1.0, Windows for Workgroups 3.1a, LM1.2X002, LANMAN2.1, NT LM 0.12, SMB 2.002
28 MS2 VistaDFSClient SMB SMB:R; Negotiate, Dialect is NT LM 0.12 (#5), SpnegoNegTokenInit
29 VistaDFSClient MS2 SMB SMB:C; Session Setup Andx, Krb5ApReq (0x100)
30 MS2 VistaDFSClient SMB SMB:R; Session Setup Andx, Krb5ApRep (0x200)
31 VistaDFSClient MS2 SMB SMB:C; Tree Connect Andx, Path = \\MS2\ROOTDFSN, Service = ?????
32 MS2 VistaDFSClient SMB SMB:R; Tree Connect Andx, Service = A:
33 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Basic Info \contoso.com\rootdfsn
104 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Basic Info, Pattern = \contoso.com\rootdfsn\Data
106 MS2 VistaDFSClient SMB SMB:R; Transact2, Query Path Info - NT Status: System - Error, Code = (599) STATUS_PATH_NOT_COVERED
107 VistaDFSClient MS2 SMB SMB:C; Tree Connect Andx, Path = \\MS2\IPC$, Service = ?????
108 MS2 VistaDFSClient SMB SMB:R; Tree Connect Andx, Service = IPC
109 VistaDFSClient MS2 DFS DFS:Get DFS Referral Request, FileName: \Ms2\rootdfsn\Data, MaxReferralLevel: 4
110 MS2 VistaDFSClient DFS DFS:Get DFS Referral Response, NumberOfReferrals: 2 VersionNumber: 4

NumberOfReferrals: 2 (0x2)
+ ReferralV4: Index:1 TTL:1800 Seconds
+ ReferralV4: Index:2 TTL:1800 Seconds
DfsPath: \Ms2\rootdfsn\Data
DfsAlternatePath: \Ms2\rootdfsn\Data
TargetPath: Index:1 \Ms2\DataReplica
TargetPath: Index:2 \Ms1\Data

111 VistaDFSClient MS2 SMB SMB:C; Tree Connect Andx, Path = \\MS2\DATAREPLICA, Service = ?????
112 MS2 VistaDFSClient SMB SMB:R; Tree Connect Andx, Service = A:
113 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Basic Info, Pattern =
114 MS2 VistaDFSClient SMB SMB:R; Transact2, Query Path Info, Query File Basic Info
115 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Standard Info, Pattern =
116 MS2 VistaDFSClient SMB SMB:R; Transact2, Query Path Info, Query File Standard Info
118 VistaDFSClient MS2 SMB SMB:C; Nt Create Andx, FileName =
119 MS2 VistaDFSClient SMB SMB:R; Nt Create Andx, FID = 0x4002 (NULL@#118)
120 VistaDFSClient MS2 SMB SMB:C; Transact2, Query File Info, Query File Internal Info, FID = 0x4002 (NULL@#118)
121 MS2 VistaDFSClient SMB SMB:R; Transact2, Query File Info, Query File Internal Info, FID = 0x4002 (NULL@#118)
122 VistaDFSClient MS2 SMB SMB:C; Transact2, Query File Info, Query File Standard Info, FID = 0x4002 (NULL@#118)
123 MS2 VistaDFSClient SMB SMB:R; Transact2, Query File Info, Query File Standard Info, FID = 0x4002 (NULL@#118)
124 VistaDFSClient MS2 SMB SMB:C; Transact2, Find First2, Both Directory Info (NT), Pattern = \*
142 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Basic Info, Pattern = \File1.txt
143 MS2 VistaDFSClient SMB SMB:R; Transact2, Query Path Info, Query File Basic Info
144 VistaDFSClient MS2 SMB SMB:C; Transact2, Query Path Info, Query File Standard Info, Pattern = \File1.txt
145 MS2 VistaDFSClient SMB SMB:R; Transact2, Query Path Info, Query File Standard Info
147 VistaDFSClient MS2 SMB SMB:C; Transact2, Find First2, Both Directory Info (NT), Pattern = \contoso.com\rootdfsn\Data
148 MS2 VistaDFSClient SMB SMB:R; Transact2, Find First2, Both Directory Info (NT)

ANALYSIS:

1. Packets 18-19 – Shows the DFS referral request from Vista client to domain controller DC2. Then DC2 comes back with a referral response and provides a list in the following order:

TargetPath: Index:1 \Ms1\rootdfsn
TargetPath: Index:2 \Ms2\rootdfsn

NOTE: The referral order is important. As we can see, MS1 is the first server in the list followed by MS2.

2. Packets 20-30 - Shows the Vista client trying to access MS1 (first server in the list) and eventually failing over to MS2.

NOTE: In frame 23 you can see MS1 send a Reset to the client before the client fails over to MS2. You may not always see this, as this particular behavior is specific to the lab environment and might differ from the network captures taken in your environment.

3. Packets 31-108 - Shows the client connecting to the root share on MS2 (\\MS2\ROOTDFSN) and navigating it before getting a STATUS_PATH_NOT_COVERED response (expected) from the server in frame 106. This indicates that the folder is a “link folder”, as opposed to a normal folder, and requires a link referral. The client then connects to IPC$ on MS2 (frames 107-108) to request it.

4. Packets 109-110 - Shows a link referral request from the client, to which MS2 responds with the following referral order:

TargetPath: Index:1 \Ms2\DataReplica
TargetPath: Index:2 \Ms1\Data

NOTE: As you can see, MS2 is the first in the list. This is not always the case. Sometimes the client will be pointed to MS1 again, which adds more delay because the client cannot access the share there and has to fail over to MS2.

5. Packets 111-148 - Shows the client accessing \\MS2\datareplica (the replica of the data on MS1) while the user navigates and accesses files (File1.txt) as needed.
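
The failover pattern described in items 2 and 3 (a Reset from one server followed by a new SYN to another) can also be located programmatically if the frame summaries are exported as text. The Python sketch below assumes the one-line-per-frame layout shown in the excerpt above (frame number, source, destination, protocol, description); the frames are abbreviated copies of frames 20-24, and the function name is hypothetical.

```python
import re

# Abbreviated copies of frames 20-24 from the trace above.
FRAMES = """\
20 VistaDFSClient MS1 TCP TCP:Flags=......S., SrcPort=51519, DstPort=Microsoft-DS(445)
21 MS1 VistaDFSClient TCP TCP:Flags=...A..S., SrcPort=Microsoft-DS(445), DstPort=51519
22 VistaDFSClient MS1 TCP TCP:Flags=...A...., SrcPort=51519, DstPort=Microsoft-DS(445)
23 MS1 VistaDFSClient TCP TCP:Flags=.....R.., SrcPort=Microsoft-DS(445), DstPort=51519
24 VistaDFSClient MS2 TCP TCP:Flags=......S., SrcPort=51520, DstPort=Microsoft-DS(445)
"""

FRAME_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+TCP\s+TCP:Flags=([.A-Z]+)")


def find_failover(text, client):
    """Return (reset_frame, failed_server, retry_frame, next_server) or None.

    Looks for a TCP Reset sent to the client, followed by a plain SYN
    (no ACK flag) from the client to a different server.
    """
    reset = None
    for raw in text.splitlines():
        match = FRAME_RE.match(raw.strip())
        if not match:
            continue
        frame, src, dst, flags = int(match.group(1)), match.group(2), match.group(3), match.group(4)
        if "R" in flags and dst == client:
            reset = (frame, src)
        elif reset and src == client and "S" in flags and "A" not in flags and dst != reset[1]:
            return (reset[0], reset[1], frame, dst)
    return None


if __name__ == "__main__":
    print(find_failover(FRAMES, "VistaDFSClient"))
```

As the NOTE under item 2 points out, a Reset is specific to this lab setup; in other environments the failure may show up as retransmitted SYNs instead, which this sketch does not detect.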

Netmon 3.3 Tip: You can parse the network captures using a combination of the following filters:

SMB
DFS
IPV4.address == a.b.c.d && IPV4.address == w.x.y.z

{Where a.b.c.d and w.x.y.z are DFS client/Server IP addresses}

RESOLUTION:

Based on the above steps, we can clearly see that the client is unable to access MS1 via DFS or via NetBIOS/FQDN. The traces also show the failure and the eventual failover to MS2, which causes the delay. When we checked the services on MS1, we found that the DFS and Server services were stopped/disabled. Once we restarted the services, normal operation resumed and the client was able to access MS1 for DFS services.

Other ways to reproduce the problem:

- Turning off MS1 (Note: There is a 50% chance of reproducing the problem using this method, since the DCs randomly order the DFS referral within a targetset. If MS2 is listed first, the client may not experience the delay, as it never attempts to reach MS1.)

or

- Configuring Target priority of the servers to list MS1 before MS2 and turning off MS1.
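
The 50% figure in the first method follows from having two targets in the same targetset. As a rough illustration, the Python sketch below shuffles the two root targets many times and counts how often the unavailable server comes first. It assumes the ordering is uniformly random, which is our reading of "randomly order" above; the function names are hypothetical.

```python
import random


def first_target_is_down(targets, down, rng):
    """Shuffle a copy of the target list and report whether `down` comes first."""
    order = list(targets)
    rng.shuffle(order)
    return order[0] == down


def estimate(trials=10_000, seed=0):
    """Estimate how often the unavailable server (MS1) heads the referral list."""
    rng = random.Random(seed)
    targets = ["MS1", "MS2"]
    hits = sum(first_target_is_down(targets, "MS1", rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(f"MS1 listed first in about {estimate():.0%} of referrals")
```

With the second method (target priority), the ordering is no longer random, so MS1 is always tried first and the delay reproduces every time.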

That’s it for this time. Next time, we will troubleshoot slow access to DFS shares in cases where clients access DFS data in remote sites.

-Sabin Nair and Shravan Kumar