In the last session, the issue turned out to be related to the modem or router, although at the beginning of troubleshooting the symptom seemed to point to Microsoft® Internet Security and Acceleration (ISA) Server. Scenarios like that can take hours to narrow down to the point where ISA Server is completely ruled out. One key aspect made a big difference in troubleshooting that scenario, however: the implementation had never worked. It was a new implementation, and the problems arose in making it work.
In this new scenario, we show a situation where everything is working fine, and then it stops working as expected.
2. Scenario–ISA Server loses access to the Windows domain and stops authenticating users
In this scenario, the problem is that domain users can no longer access the Internet. The environment was working, and suddenly users start to receive pop-up windows asking for authentication while browsing external Web sites. The authentication request appears for all users, and even when users type the correct credentials, the authentication window appears again.
This is the behavior from the user’s perspective. The network administrator also encounters an error when trying to open the firewall policy. The following error message appears:
We confirm that the ISA Server computer is a member of a Microsoft Windows domain. Based on the error, it appears that the domain controller cannot be contacted. Although things are becoming clear based on this message and the behavior, we don’t know exactly what is happening or why.
The following figure shows the basic network topology, including where the ISA Server computer is located, according to the customer.
3. Collecting data
Because the issue is related to communication between the ISA Server computer and the domain controller, we start with tests to validate communication between these computers.
Commands executed from ISA Server
Ping to the domain controllers
Answers received without any packets being dropped.
Based on the result of this test, we observe many things, but the most important one is this report:
DC list test . . . . . . . . . . . : Failed
[WARNING] Cannot call DsBind to clusterdc.ctest.com
Trust relationship test. . . . . . : Failed
Test to ensure DomainSid of domain ‘CTEST’ is correct.
[FATAL] Secure channel to domain ‘CTEST’ is broken.
The secure channel test could not be completed. There are issues communicating over remote procedure call (RPC).
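When a diagnostic report is long, it can help to pull out only the failed tests and the warnings attached to them. The following is a minimal sketch that assumes the report uses the line layout seen in the excerpt above (a dotted summary line ending in a result word, followed by bracketed detail lines). The function name `failed_tests` and the result words `Passed` and `Skipped` are assumptions made for illustration; only `Failed` appears in the excerpt.

```python
import re

# Summary lines look like "DC list test . . . . : Failed" in the excerpt.
# "Passed" and "Skipped" are assumed result words; only "Failed" is shown there.
RESULT_LINE = re.compile(
    r"^\s*(?P<name>.+?)[\s.]*:\s*(?P<result>Passed|Failed|Skipped)\s*$"
)
# Detail lines look like "[FATAL] Secure channel to domain 'CTEST' is broken."
DETAIL_LINE = re.compile(r"^\s*\[(?P<level>WARNING|FATAL)\]\s*(?P<text>.+?)\s*$")


def failed_tests(report: str) -> dict[str, list[str]]:
    """Map each failed test name to the bracketed details that follow it."""
    failures: dict[str, list[str]] = {}
    current = None  # name of the failed test currently being read, if any
    for line in report.splitlines():
        result = RESULT_LINE.match(line)
        if result:
            if result.group("result") == "Failed":
                current = result.group("name")
                failures[current] = []
            else:
                current = None
            continue
        detail = DETAIL_LINE.match(line)
        if detail and current is not None:
            failures[current].append(
                f'{detail.group("level")}: {detail.group("text")}'
            )
    return failures


# Excerpt from the text, with plain quotation marks.
REPORT = """\
DC list test . . . . . . . . . . . : Failed
[WARNING] Cannot call DsBind to clusterdc.ctest.com
Trust relationship test. . . . . . : Failed
Test to ensure DomainSid of domain 'CTEST' is correct.
[FATAL] Secure channel to domain 'CTEST' is broken.
"""

for name, details in failed_tests(REPORT).items():
    print(name, "->", details)
```

Lines that are neither a summary nor a bracketed detail, such as the "Test to ensure DomainSid" description, are skipped on purpose so that only actionable messages remain.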
Trusted DC Name
Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
The command completed successfully
We executed this test to determine whether the secure channel is broken.
Based on the output of those commands, the secure channel appears to be broken. We try to reset the secure channel using the NLTEST command, but communication between the ISA Server computer and the domain controller is not occurring.
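The status lines in the NLTEST output carry a decimal code, the same code in hexadecimal, and a symbolic name. As a rough sketch, the captured text can be checked automatically. The helper names `parse_statuses` and `secure_channel_ok` are hypothetical, and the parser assumes every status line follows the `label = decimal hex NAME` layout shown above. A code of 0 is treated as success, following the usual Win32 convention.

```python
import re
from typing import NamedTuple


class TrustStatus(NamedTuple):
    label: str  # text before the "=" sign
    code: int   # decimal Win32 error code
    name: str   # symbolic error name


# Assumed layout: "<label> = <decimal> <hex> <NAME>"
STATUS_LINE = re.compile(
    r"^\s*(?P<label>[^=]+?)\s*=\s*(?P<code>\d+)\s+0x[0-9a-fA-F]+\s+(?P<name>\S+)\s*$"
)


def parse_statuses(output: str) -> list[TrustStatus]:
    """Collect every status line found in the captured command output."""
    statuses = []
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            statuses.append(
                TrustStatus(match.group("label"), int(match.group("code")), match.group("name"))
            )
    return statuses


def secure_channel_ok(output: str) -> bool:
    """True only if at least one status line exists and all codes are 0."""
    statuses = parse_statuses(output)
    return bool(statuses) and all(status.code == 0 for status in statuses)


# Output quoted in the text; the domain controller name is elided there too.
SAMPLE = """\
Trusted DC Name
Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
Trust Verification Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
The command completed successfully
"""

for status in parse_statuses(SAMPLE):
    print(f"{status.label}: {status.code} ({status.name})")
print("secure channel healthy:", secure_channel_ok(SAMPLE))
```

Note that "The command completed successfully" only means the tool itself ran. The status codes are what indicate whether the secure channel is healthy, which is why the sketch ignores that line.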
Moving further, we decide to get Network Monitor traces to understand where the RPC communication is failing.
4. Analyzing data
Again, the Network Monitor trace is a key element in determining the root cause of the issue. The following happens:
· We bind Network Monitor to the ISA Server internal network interface card (NIC) and run the command nltest /sc_verify:ctest again. The trace shows that the TCP three-way handshake completes and the endpoint mapper ports are negotiated:
· After that, ISA Server tries to bind the higher ports:
· The packet that ISA Server sends is not answered by the domain controller. ISA Server retransmits the packet, and after three seconds, the domain controller sends a TCP reset (RST):
· At this point, the following error message appears in the ISA Server Command Prompt window:
After this test, we conclude that there are issues in binding higher ports to the domain controller. The domain functionality is working on the Internal network. In other words, RPC communication is working internally, Microsoft Exchange Server is working, logon functionality is working, and the environment is stable internally.
According to the customer, the environment was working before, and the infrastructure did not change. The customer says that the ports on the third-party firewall that connects to the corporate network are open. We test using PortQuery Scan, and verify that the ports are open.
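A port check of this kind can be approximated with a plain TCP connection attempt. The sketch below is not the PortQuery tool itself; it is a minimal stand-in, and the function names `tcp_port_open` and `probe` are hypothetical. Port 135 is the well-known RPC endpoint mapper port. The second port in the commented usage line is only a placeholder for whichever dynamic port the endpoint mapper hands out in a given environment.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Check several ports on one host and report which ones accept connections."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example usage (hypothetical dynamic port; host name taken from the report above):
# print(probe("clusterdc.ctest.com", [135, 49152]))
```

A successful connection only shows that the TCP handshake completes. In this case the trace showed the handshake succeeding while later RPC traffic went unanswered, so an open port does not by itself prove that RPC works end to end. That is why the packet capture remained necessary even after the port check passed.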
The firewall administrator is contacted, and after troubleshooting, it is determined that one feature on the firewall performs a kind of intrusion-detection analysis on the traffic. This feature is dropping the RPC endpoint mapper traffic, which causes the RPC communication to fail. According to the administrator, this feature was enabled a few days before the problem started to happen. Because the change was not documented, the administrator didn’t know that it was the root cause of the problem.
After disabling this feature, everything returns to normal, and users are able to browse the Internet without problems.
Support Engineer –