Time Matters - When ISA Server is affected by Windows Time settings

1. Introduction

 

As the title of this post suggests, this is all about time and keeping systems in sync. Many administrators think that time just matters if Kerberos is somehow involved in the deployment, which is not true. This post will describe two scenarios where ISA Server was having problems performing the expected operation due Windows Time settings.

 

2. Scenario 1 – Failing to Synchronize with CSS - ISA Server in workgroup

 

When the issue is related to synchronization with CSS in workgroup, one of the primary thing to be concerned about is the certificate. Recently, we released a new article on Tales from the Edge about troubleshooting primary CSS down scenario and it is very worth to go through that troubleshooting flow.

 

For this specific scenario the problem was as follows:

- ISA Administrator first identified that changes that were done in the firewall policy were not taking effect. He followed the troubleshooting approach from this article here and fixed the issue.

- After fixing this problem, ISA Administrator notices the following screen when he was trying to synchronize the array after changing a rule:

 

 

Figure 1 – Unable to sync with CSS.

- Apparently the CSS (SRVISAWGN1) was able to synchronize with itself. This means that the new server certificate that was installed on this server was working properly.

- Notice that the last update that SRVISAWGN2 received was on 09/01.

- As a troubleshooting measure, the ISA administrator went to the SRVISAWGN2 server and tried to connect to the CSS using LDP on port 2172 with SSL and here the result from LDP window:

Figure 2- Unable to establish a connection with CSS using SSL.

During this connection attempt a netmon capture was performed. Since this particular issue started after a new certificate was imported on CSS, the first step reading the netmon trace is verify if the SSL handshake is working correctly. Here it is the traffic:

- SRVISAWGN2 sends the initial SSL handshake:

10.20.20.200 10.20.20.201 SSL SSL:SSLv2RecordLayer, ClientHello(0x01)

- SRVISAWGN1 sends SSL Server Hello with the certificate:

10.20.20.201 10.20.20.200 SSL SSL: Server Hello. Certificate.

- Ssl: Server Hello. Certificate.

- TlsRecordLayer:

ContentType: HandShake

+ Version: TLS 1.0

Length: 4703 (0x125F)

- SSLHandshake: SSL HandShake TLS 1.0 Certificate(0x0B)

HandShakeType: ServerHello(0x02)

Length: 70 (0x46)

- ServerHello: 0x1

+ Version: TLS 1.0

+ RandomBytes:

SessionIDLength: 32 (0x20)

SessionID: Binary Large Object (32 Bytes)

CipherSuite: SSL_RSA_WITH_RC4_128_MD5 { 0x00,0x04 }

CompressionMethods: 0 (0x0)

HandShakeType: Certificate(0x0B)

Length: 1384 (0x568)

- Cert: 0x1

CertOffset: 1381 (0x565)

- Certificates:

CertificateLength: 1378 (0x562)

- X509Cert: Issuer: Fabrikam CA,fabrikam,msft, Subject: SRVISAWGN1,KELLER,TEXAS

+ SequenceHeader:

- TbsCertificate: Issuer: Fabrikam CA,fabrikam,msft, Subject: SRVISAWGN1,KELLER,TEXAS

+ SequenceHeader:

+ Tag0:

+ Version: v3 (2)

+ SerialNumber: 0x611ab2d3000000000003

+ Signature: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)

+ Issuer: Fabrikam CA,fabrikam,msft

+ Validity: From: 09/02/09 18:30:12 UTC To: 09/02/10 18:40:12 UTC

+ Subject: SRVISAWGN1,KELLER,TEXAS

+ SubjectPublicKeyInfo: RsaEncryption (1.2.840.113549.1.1.1)

+ Tag3:

+ Extensions:

+ SignatureAlgorithm: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)

+ Signature:

Certificates:

Note: notice that the certificate was issued on 09/02/2009.

- SRVISAWGN2 sends a TCP FIN right away.

10.20.20.200 10.20.20.201 TCP TCP:Flags=...A...F, SrcPort=1059, DstPort=2172, PayloadLen=0, Seq=4100066293, Ack=270593595, Win=65535 (scale factor 0x0) = 65535

- With this behavior we can conclude that the SSL handshake was not performed correctly.

After reviewing the windows date and time settings on both ISA Servers, we noticed that SRVISAWGN2 has a different date (two days behind) the correct date. It now makes total sense that the SSL handshake fails, since the certificate will appears as invalid because the issue date is ahead of the current system date. After fixing the date/time and redoing the LDP test, we were able to establish the SSL connection. Here it is the working LDP output:

ld = ldap_sslinit("srvisawgn1", 2172, 1);

Error <0x0> = ldap_set_option(hLdap, LDAP_OPT_PROTOCOL_VERSION, LDAP_VERSION3);

Error <0x0> = ldap_connect(hLdap, NULL);

Error <0x0> = ldap_get_option(hLdap,LDAP_OPT_SSL,(void*)&lv);

Host supports SSL, SSL cipher strength = 128 bits

Established connection to srvisawgn1.

Retrieving base DSA information...

Result <0>: (null)

Matched DNs:

Getting 1 entries:

>> Dn:

1> currentTime: 9/3/2009 7:5:38 Central America Standard Time Central America Standard Time;

1> subschemaSubentry: CN=Aggregate,CN=Schema,CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4};

1> dsServiceName: CN=NTDS Settings,CN=SRVISAWGN1$ISASTGCTRL,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4};

3> namingContexts: CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4}; CN=Schema,CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4}; CN=FPC2;

1> schemaNamingContext: CN=Schema,CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4};

1> configurationNamingContext: CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4};

22> supportedControl: 1.2.840.113556.1.4.319; 1.2.840.113556.1.4.801; 1.2.840.113556.1.4.473; 1.2.840.113556.1.4.528; 1.2.840.113556.1.4.417; 1.2.840.113556.1.4.619; 1.2.840.113556.1.4.841; 1.2.840.113556.1.4.529; 1.2.840.113556.1.4.805; 1.2.840.113556.1.4.521; 1.2.840.113556.1.4.970; 1.2.840.113556.1.4.1338; 1.2.840.113556.1.4.474; 1.2.840.113556.1.4.1339; 1.2.840.113556.1.4.1340; 1.2.840.113556.1.4.1413; 2.16.840.1.113730.3.4.9; 2.16.840.1.113730.3.4.10; 1.2.840.113556.1.4.1504; 1.2.840.113556.1.4.1852; 1.2.840.113556.1.4.802; 1.2.840.113556.1.4.1907;

2> supportedLDAPVersion: 3; 2;

12> supportedLDAPPolicies: MaxPoolThreads; MaxDatagramRecv; MaxReceiveBuffer; InitRecvTimeout; MaxConnections; MaxConnIdleTime; MaxPageSize; MaxQueryDuration; MaxTempTableSize; MaxResultSetSize; MaxNotificationPerConn; MaxValRange;

1> highestCommittedUSN: 61464;

4> supportedSASLMechanisms: GSSAPI; GSS-SPNEGO; EXTERNAL; DIGEST-MD5;

1> dnsHostName: SRVISAWGN1;

1> serverName: CN=SRVISAWGN1$ISASTGCTRL,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,CN={892853B5-003E-4C6D-87AF-BBAFC9814DE4};

3> supportedCapabilities: 1.2.840.113556.1.4.1851; 1.2.840.113556.1.4.1791; 1.2.840.113556.1.4.1880;

1> isSynchronized: TRUE;

1> forestFunctionality: 2 = ( DS_BEHAVIOR_WIN2003 );

1> domainControllerFunctionality: 2 = ( DS_BEHAVIOR_WIN2003 );

-----------

This change of system date also fixed the synchronization problem between both ISA Servers.

3. Scenario 2 – Web publishing rule just works when it goes through one member of the ISA Server Array in NLB

This scenario was about a simple web publishing rule using Kerberos Constrained Delegation (KCD), where sometimes the external client was able to authenticate and sometimes were getting the error message: 401 unauthorized, after typing or providing the credentials in the logon forms (FBA). Later in the troubleshooting, we identified that the problem was reproducible only when the traffic passed through one specific array member (ISASRV02). Figure 1 shows the diagram for this scenario:

Figure 3 - Topology using NLB and inconsistent traffic flow behavior.

After isolating that the issue only happened through ISASRV02 we got Network Monitor on the internal interface of ISASRV02 and we could see that ISA was correctly sending the Kerberos ticket request to the DC:

ISASRV02 DC01 KerberosV5 KerberosV5:TGS Request

- Kerberos: TGS Request

+ Length: Length = 5809

- TgsReq: Kerberos TGS Request

- ApplicationTag:

+ AsnId: Application Constructed Tag (12)

+ AsnLen: Length = 5805, LengthOfLength = 2

- KdcReq: KRB_TGS_REQ (12)

+ SequenceHeader:

+ Tag1:

+ Pvno: 5

+ Tag2:

+ MsgType: KRB_TGS_REQ (12)

+ Tag3:

+ PaData:

…and the DC was answering with the error below:

DC01 ISASRV02 KerberosV5 KerberosV5:KRB_ERROR - KRB_AP_ERR_SKEW (37)

- Kerberos: KRB_ERROR - KRB_AP_ERR_SKEW (37)

+ Length: Length = 103

- KrbError: KRB_ERROR (30)

+ ApplicationTag:

+ SequenceHeader:

+ Tag0:

+ PvNo: 5

+ Tag1:

+ MsgType: KRB_ERROR (30)

+ Tag4:

+ Stime: 08/13/2009 18:50:28 UTC

+ Tag5:

+ SuSec: 875196

+ Tag6:

+ ErrorCode: KRB_AP_ERR_SKEW (37)

+ Tag9:

+ Realm: CONTOSO.LOCAL

+ TagA:

+ Sname: host/isasrv.contoso.local

This error means: there is time discrepancy between client and server or client and KDC (definition from KB 230476). Interesting part is that ISASRV02 was only 3 minutes ahead of the DC, which “by default” shouldn’t be a problem (since the default is 5 minutes). However in this case the system administrator has changed the “Maximum tolerance for computer clock synchronization” group policy to 3 minutes. Fixing the timer on this ISA box fixed the authentication issue.

4. Conclusion

Time synchronization problem can be very painful to resolve because of a simple reason: most of the administrators overlook that setting and start a deeper troubleshooting. When dealing with ISA Server, it’s important to perform sanity check on the basic configuration and this includes verifying that the time on the ISA array members is in sync (with each other) or is synced with a time server which maybe a DC or an external time server.

Authors

Yuri Diogenes

Sr Security Support Escalation Engineer

Microsoft CSS Forefront Edge Team

Mohit Kumar

Security Support Engineer

Microsoft CSS Forefront Edge Team