Isolating problems that seem to be related to ISA Server – Part II

1. Introduction

In the last session, the issue turned out to be related to the modem or router, although at the beginning of troubleshooting the symptom seemed to point to Microsoft® Internet Security and Acceleration (ISA) Server. Scenarios like that can take hours to narrow down before ISA Server is completely ruled out as the cause. However, one key aspect made a big difference in troubleshooting that scenario: the implementation had never worked. It was a new implementation, and there were problems making it work.

In this new scenario, we show a situation where everything is working fine, and then it stops working as expected.

2. Scenario–ISA Server loses access to the Windows domain and stops authenticating users

In this scenario, the problem is that domain users can no longer access the Internet. The environment was working, and suddenly users start to receive pop-up windows asking for authentication while browsing external Web sites. The authentication request appears for all users, and even when users type the correct credentials, the authentication window reappears.

This is the behavior from the user's perspective. The network administrator also encounters an error when trying to open the firewall policy. The following error message appears:

"The trust relationship between this workstation and the primary domain failed. 0x800706fd"

We confirm that the ISA Server computer is a member of a Microsoft Windows domain. Based on the error, it appears that the domain controller cannot be contacted. Although things are becoming clear based on this message and the behavior, we don't know exactly what is happening or why.
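The hexadecimal value at the end of the error message can be read as a standard 32-bit HRESULT. The following minimal Python sketch splits it into its bit fields; the function name decode_hresult is only illustrative. For 0x800706fd, the facility is 7 (Win32) and the code is 1789, which is the Win32 error ERROR_TRUSTED_RELATIONSHIP_FAILURE and matches the text of the message.

```python
# Minimal sketch: split a 32-bit HRESULT into its standard bit fields.
# decode_hresult is a hypothetical helper name, not part of any ISA Server tool.
FACILITY_WIN32 = 7  # facility value used for wrapped Win32 error codes


def decode_hresult(hresult: int) -> dict:
    """Return the failure bit, facility, and code fields of an HRESULT."""
    hresult &= 0xFFFFFFFF  # treat the value as unsigned 32-bit
    return {
        "failure": bool(hresult >> 31),       # bit 31: 1 means failure
        "facility": (hresult >> 16) & 0x7FF,  # bits 16-26
        "code": hresult & 0xFFFF,             # bits 0-15
    }


fields = decode_hresult(0x800706FD)
print(fields)
if fields["facility"] == FACILITY_WIN32:
    print("Win32 error code:", fields["code"])
```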

The following figure shows the basic network topology, including where the ISA Server computer is located, according to the customer.

Figure 1. Network Topology

3. Collecting data

Because the issue is related to the communication between the ISA Server computer and the domain controller, we start with tests that validate communication between these two computers.

The following commands were executed from the ISA Server computer.

Command: Ping to the domain controllers

Result: Answers received without any packets being dropped.

Notes: None.

Command: netdiag /v

Result: Based on the result of this test, we observe many things, but the most important one is this report:

    DC list test . . . . . . . . . . . : Failed
    [WARNING] Cannot call DsBind to clusterdc.ctest.com (192.168.0.10). [RPC_S_CALL_FAILED]
    Trust relationship test. . . . . . : Failed
    Test to ensure DomainSid of domain 'CTEST' is correct.
    [FATAL] Secure channel to domain 'CTEST' is broken. [ERROR_NO_LOGON_SERVERS]

Notes: The secure channel test could not be completed. There are issues communicating using remote procedure call (RPC).

Command: nltest /sc_verify:ctest

Result:

    Flags: 80
    Trusted DC Name
    Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
    Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
    The command completed successfully

Notes: We executed this test to determine if the secure channel is broken.

Based on the results of those commands, it appears that the secure channel is broken. We try to reset the secure channel using the NLTEST command, but the reset also fails because RPC communication between the ISA Server computer and the domain controller is not occurring.
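Note that NLTEST prints "The command completed successfully" even though both status lines report error 1311, so the status lines are what matter. As a rough illustration, the following Python sketch extracts the status codes from the output text. The helper names are hypothetical, the parsing is based only on the output shown in this article, and the sketch assumes that a healthy secure channel reports status code 0 on every status line.

```python
import re

# Matches lines such as:
#   Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
STATUS_LINE = re.compile(
    r"^\s*(?P<label>.*Status)\s*=\s*(?P<code>\d+)\s+"
    r"(?P<hex>0x[0-9a-fA-F]+)\s+(?P<name>\w+)"
)

# Output copied from the nltest /sc_verify:ctest run in this article.
SAMPLE = """Flags: 80
Trusted DC Name
Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
The command completed successfully
"""


def parse_sc_verify(output: str) -> list:
    """Return (label, numeric code, symbolic name) for every status line."""
    statuses = []
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            statuses.append(
                (match.group("label").strip(),
                 int(match.group("code")),
                 match.group("name"))
            )
    return statuses


def secure_channel_ok(output: str) -> bool:
    """Assumption: the channel is healthy only if every status code is 0."""
    statuses = parse_sc_verify(output)
    return bool(statuses) and all(code == 0 for _, code, _ in statuses)


for label, code, name in parse_sc_verify(SAMPLE):
    print(f"{label}: {code} ({name})")
print("Secure channel OK:", secure_channel_ok(SAMPLE))
```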

Next, we capture Network Monitor traces to understand where the RPC communication is failing.

4. Analyzing data

Again, the Network Monitor trace is a key element in determining the root cause of the issue. The following happens:

· We bind Network Monitor to the ISA Server internal network interface card (NIC) and run the command nltest /sc_verify:ctest again. The trace shows that the TCP three-way handshake with the endpoint mapper port (TCP 135) on the domain controller completes successfully:

2788 15:24:32.530 192.168.0.8 clusterdc.ctest.com TCP TCP: Flags=.S...... , SrcPort=1571, DstPort=DCE endpoint resolution(135), Len=0, Seq=853700869, Ack=0, Win=16384 (scale factor not found)

2789 15:24:32.530 clusterdc.ctest.com 192.168.0.8 TCP TCP: Flags=.S..A... , SrcPort=DCE endpoint resolution(135), DstPort=1571, Len=0, Seq=3785297540, Ack=853700870, Win=16384 (scale factor not found)

2790 15:24:32.530 192.168.0.8 clusterdc.ctest.com TCP TCP: Flags=....A..., SrcPort=1571, DstPort=DCE endpoint resolution(135) , Len=0, Seq=853700870, Ack=3785297541, Win=17520 (scale factor not found)

· After that, ISA Server sends an RPC bind request to the endpoint mapper, which is the step that precedes any connection to the higher ports:

2791 15:24:32.530 192.168.0.8 clusterdc.ctest.com MSRPC MSRPC: c/o Bind: UUID{E1AF8308-5D1F-11C9-91A4-08002B14A0FA} Endpoint Mapper Call=0x1 Assoc Grp=0x0 Xmit=0x16D0 Recv=0x16D0

+ Tcp: Flags=...PA..., SrcPort=1571, DstPort=DCE endpoint resolution(135), Len=72, Seq=853700870 - 853700942, Ack=3785297541, Win=17520 (scale factor not found)

- RPC: c/o Bind: UUID{E1AF8308-5D1F-11C9-91A4-08002B14A0FA} Endpoint Mapper Call=0x1 Assoc Grp=0x0 Xmit=0x16D0 Recv=0x16D0

  + Bind: {E1AF8308-5D1F-11C9-91A4-08002B14A0FA} Endpoint Mapper

· The packet that ISA Server sends is not answered by the domain controller. ISA Server retransmits the packet, and after about three seconds, the domain controller sends a TCP reset (RST):

2831 15:24:35.155 clusterdc.ctest.com 192.168.0.8 TCP TCP: Flags=..R....., SrcPort=DCE endpoint resolution(135), DstPort=1571, Len=0, Seq=3785297601, Ack=3785297601, Win=0 (scale factor not found)

- Tcp: Flags=..R....., SrcPort=DCE endpoint resolution(135), DstPort=1571, Len=0, Seq=3785297601, Ack=3785297601, Win=0 (scale factor not found)

    SrcPort: DCE endpoint resolution(135)

    DstPort: 1571

    SequenceNumber: 3785297601 (0xE19F0EC1)

    AcknowledgementNumber: 3785297601 (0xE19F0EC1)

  + DataOffset: 80 (0x50)

  - Flags: ..R.....

     CWR: (0.......) CWR not significant

     ECE: (.0......) ECN-Echo not significant

     Urgent: (..0.....) Not Urgent Data

     Ack: (...0....) Acknowledgement field not significant

     Push: (....0...) No Push Function

     Reset: (.....1..) Reset

     Syn: (......0.) Not Synchronize sequence numbers

     Fin: (.......0) Not End of data

    Window: 0 (scale factor not found)

    Checksum: 34874 (0x883A)

    UrgentPointer: 0 (0x0)

· At this point, the following error message appears in the ISA Server Command Prompt window:

Flags: 80

Trusted DC Name

Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS

Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS

The command completed successfully
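The pattern in the trace above (a completed handshake, an unanswered request, and then a reset) can also be checked mechanically. The following Python sketch parses the frame summary lines and measures the delay between the final ACK of the handshake and the first RST. It assumes that the flag string in a summary line carries one letter per set flag, as in the frames shown; the frame lines are copied from this article with their trailing fields trimmed, and the function names are illustrative only.

```python
import re

# Frame number, timestamp, source, destination, and the 8-character flag string.
FRAME = re.compile(
    r"^(?P<frame>\d+)\s+(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)\.(?P<ms>\d{3})\s+"
    r"(?P<src>\S+)\s+(?P<dst>\S+)\s+.*?Flags=(?P<flags>[.A-Z]{8})"
)

# Summary lines from the article, trailing fields trimmed for brevity.
TRACE = [
    "2788 15:24:32.530 192.168.0.8 clusterdc.ctest.com TCP TCP: Flags=.S...... , SrcPort=1571",
    "2789 15:24:32.530 clusterdc.ctest.com 192.168.0.8 TCP TCP: Flags=.S..A... , DstPort=1571",
    "2790 15:24:32.530 192.168.0.8 clusterdc.ctest.com TCP TCP: Flags=....A..., SrcPort=1571",
    "2791 15:24:32.530 192.168.0.8 clusterdc.ctest.com MSRPC MSRPC: c/o Bind: Endpoint Mapper",
    "2831 15:24:35.155 clusterdc.ctest.com 192.168.0.8 TCP TCP: Flags=..R....., DstPort=1571",
]


def parse_frames(lines):
    """Return one dict per line that carries a TCP flag string."""
    frames = []
    for line in lines:
        m = FRAME.match(line)
        if not m:
            continue  # for example, the MSRPC summary has no Flags= field
        ms = ((int(m["h"]) * 60 + int(m["m"])) * 60 + int(m["s"])) * 1000 + int(m["ms"])
        frames.append({
            "frame": int(m["frame"]),
            "ms": ms,                          # time of day in milliseconds
            "src": m["src"],
            "flags": set(m["flags"]) - {"."},  # letters of the flags that are set
        })
    return frames


def handshake_then_reset(frames):
    """Return ms between the handshake's final ACK and the first RST, else None."""
    kinds = [f["flags"] for f in frames]
    if kinds[:3] != [{"S"}, {"S", "A"}, {"A"}]:
        return None  # the first three frames are not SYN, SYN-ACK, ACK
    for f in frames[3:]:
        if "R" in f["flags"]:
            return f["ms"] - frames[2]["ms"]
    return None


delay = handshake_then_reset(parse_frames(TRACE))
print(f"RST arrived {delay} ms after the handshake completed")
```

For the frames in this article, the delay is 2625 ms, which is consistent with the roughly three seconds described above.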

5. Conclusion

After this test, we conclude that the ISA Server computer has problems completing the RPC bind with the domain controller. Domain functionality is working on the Internal network. In other words, RPC communication is working internally, Microsoft Exchange Server is working, logon functionality is working, and the environment is stable internally.

According to the customer, the environment was working before, and the infrastructure did not change. The customer says that the required ports are open on the third-party firewall that connects to the corporate network. We run a port scan with PortQry and verify that the ports are open.

The firewall administrator is contacted, and after troubleshooting, it is determined that one feature on the firewall performs a kind of intrusion-detection analysis on the traffic. This feature is dropping the RPC endpoint mapper traffic, which causes the RPC communication to fail. According to the administrator, this feature was enabled a few days before the problem started to happen. Because the change was not documented, the administrator did not realize that it was the root cause of the problem.
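This case shows why an open port in a port scan does not prove that the application protocol works: the scan only confirms the TCP handshake, while the inspection feature dropped the request that followed. The following Python sketch illustrates the difference under simple assumptions. The probe function and its result labels are hypothetical, and the payload is whatever request the caller supplies; the sketch does not build a real RPC bind request.

```python
import socket


def probe(host, port, payload, timeout=3.0):
    """Classify a TCP service by what happens after the handshake.

    Returns 'closed' if the connection cannot be established,
    'open-no-reply' if the request is sent but nothing comes back in time,
    'reset' if the connection is reset after the request,
    'closed-by-peer' if the peer closes without sending data,
    and 'replied' if any data is received.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "closed"  # no handshake: a port scan would not report this port as open
    with sock:
        try:
            sock.sendall(payload)
            data = sock.recv(4096)
        except socket.timeout:
            return "open-no-reply"  # handshake succeeded, request went unanswered
        except ConnectionResetError:
            return "reset"          # handshake succeeded, then an RST arrived
        return "replied" if data else "closed-by-peer"
```

With a result of "open-no-reply" or "reset", a port scan would still show the port as open, which is consistent with what we saw in this scenario.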

After disabling this feature, everything returns to normal, and users are able to browse the Internet without problems.

Yuri Diogenes

Support Engineer – Latin America Team – Platforms

Microsoft