The Riverbed Field Guide for the AD Admin

Unexpected TCP resets, intermittent “Network Path Not Found” errors, and SMB dialects being downgraded. These errors point to something very odd, and potentially very bad, happening on the network. If you are like many AD administrators, at the first sign of network impropriety you engage the network team and request that the network issues be addressed. While on the call with the networking team, you may hear a very unexpected resolution to the issue. The team may request that you add their non-Windows network devices to your domain as an RODC (Read Only Domain Controller) or grant a service account permission to replicate with your domain controllers.

Before you hang up on the networking team or stop reading this blog, hear me out. WAN optimizers and other caching technologies such as BranchCache can offer substantial improvement in WAN performance. This is especially true in parts of the globe where high speed WAN connections are either cost prohibitive or non-existent. For many companies, the benefits of these devices far outweigh the support impact and potential security implications.

This is Premier Field Engineer Greg Campbell, here to remove a little of the mystery about how Riverbed’s Steelhead appliances alter what we see on the wire, how best to reduce or prevent support issues, and the significant security considerations when integrating them with Active Directory. Let’s start with a little background on the devices and how they operate. Then we can address the security questions, followed by some troubleshooting tips. Here’s what we are going to cover.

  • Riverbed Primer – What you need to know when looking at network captures
  • Security Implications – What you need to know before integrating the Steelheads with Active Directory
  • Troubleshooting info – Common issues you may encounter with Steelheads in the environment.

While the Steelheads can optimize HTTPS traffic (with access to the web site’s private key) and MAPI traffic, the focus of this blog will be on optimizing SMB traffic.

Before we go any further please review our recently published support boundaries on this topic: https://support.microsoft.com/en-us/kb/3192506

1 – Riverbed Primer

Riverbed Technology Inc is the manufacturer of WAN optimization products, including the Steelhead branded products. Steelhead comes in many forms, including appliance devices, a soft client for Windows and Mac, and a cloud based solution.

The Steelheads can perform three levels of traffic optimization:

  • TCP optimization – This includes optimizations at the TCP layer, including optimizing TCP acknowledgment traffic, and TCP window size.
  • Scalable data reference (data deduplication) – Instead of sending entire data sets, only changes and references to previously sent data are sent over the WAN. This is most effective for small changes to large files.
  • Application latency optimization – For applications such as file sharing (SMB), Steelhead appliances optimize the protocol by reducing protocol overhead and increasing data throughput.

The last one, application latency, can provide the most impactful performance gains. However, to accomplish application latency optimization, the SMB traffic either needs to be unsigned or the Steelheads will need to inject themselves into the signed client-server communication. I will address that massive can of worms I just cracked, later. For now, just remember that there are up to 3 levels of optimization that can be performed.
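To make the scalable data reference idea from the list above a little more concrete, here is a toy sketch in Python. It is not Riverbed's algorithm. The fixed 4-byte chunk size, the SHA-256 fingerprint and the names `encode` and `decode` are all assumptions picked for illustration. The point is only that once both sides have seen a chunk, a short reference can travel over the WAN instead of the bytes.

```python
import hashlib

CHUNK = 4  # toy chunk size (assumption); chosen small so the example is readable


def encode(data: bytes, seen: dict) -> list:
    """Turn data into ('raw', bytes) or ('ref', digest) tokens.

    `seen` is the sender's record of chunks the peer already holds.
    """
    tokens = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            tokens.append(("ref", digest))   # peer has it: send only a reference
        else:
            seen[digest] = chunk
            tokens.append(("raw", chunk))    # first sighting: send the bytes
    return tokens


def decode(tokens: list, store: dict) -> bytes:
    """Rebuild the data on the receiving side, learning new chunks as they arrive."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "raw":
            store[hashlib.sha256(value).digest()] = value
            out += value
        else:
            out += store[value]
    return bytes(out)


sender_seen, receiver_store = {}, {}
v1 = b"AAAABBBBCCCCDDDD"
v2 = b"AAAABBBBXXXXDDDD"   # same "file" with one chunk changed

first = encode(v1, sender_seen)
second = encode(v2, sender_seen)
rebuilt_v1 = decode(first, receiver_store)
rebuilt_v2 = decode(second, receiver_store)
raw_in_second = sum(1 for kind, _ in second if kind == "raw")
```

In this toy run the second transfer carries one raw chunk and three references, which is why small changes to large files benefit the most.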

Three TCP Connections

Don’t confuse this with the traditional 3-way handshake. It’s 3 separate TCP sessions, each with its own 3-way handshake. I guess you could say it’s a 9-way handshake. Here are the 3 legs of the journey (see figure 1).

  1. Client to Steelhead (LAN) – The first session is between the client and the local Steelhead appliance operating in the client role. This traffic is not optimized and is a regular TCP session.
  2. Steelhead to Steelhead (WAN) – The second session is TLS encrypted between the two Steelhead appliances. This TLS protected TCP session is a proxied connection that corresponds to the two outer TCP sessions to the client and the server. This traffic is invisible to the client and server and is where the bulk of the optimization takes place.
  3. Steelhead to Server (LAN) – The third session is between the Steelhead operating in the server role and the server. This traffic is not optimized and is a regular TCP session.

Figure 1 – The 3 separate TCP sessions

When looking at simultaneous (client/server) network captures, there will be completely different TCP sequence numbers, packet ID numbers and possibly different packet sizes. This is because they are different TCP sessions. Though they are different TCP sessions, the Steelhead appliances use a feature called IP and port transparency. This means the IP and port on the client and the server are not altered by the optimization. This can be helpful when attempting to align the client side and server side conversations in the two network captures.

While Steelhead appliances do not have configured roles, to help explain traffic flow and AD requirements, the Steelhead nearest the client is in the “client role”, or C-SH. The Steelhead nearest the server is in the “server role”, or S-SH. These roles would be reversed if the traffic were reversed, for example when a server in the data center accesses the client for system management.

Steelhead Bypass Options

There are times when traffic cannot or should not be optimized by the Steelheads. During troubleshooting, it can be helpful to try bypassing the Steelheads to determine if they are involved in the issue. Because there are different ways to bypass the Steelheads, it’s helpful for the AD admin to know which method of bypass was used. Especially if the bypass method didn’t address the issue. Most of the bypass methods do not completely bypass the Steelheads nor disable all levels of optimization. If the issue persists after bypassing has been enabled, it may be necessary to use a different bypass method.

Steelhead’s IP Blacklist

The Steelhead appliances will attempt to optimize all SMB traffic that is not already excluded. If the traffic cannot be application latency optimized, the IP addresses are put in a dynamic exclusion list called a “blacklist” for 20 minutes. The next time a connection is attempted, the Steelhead will allow the traffic to bypass latency optimization. If a single IP address appears on the blacklist several times, it will be put onto a long-term blacklist, so the first failure is avoided. The long-term blacklist will persist until the Steelhead is rebooted or the list is manually cleared by an administrator.

In the case of signed SMB traffic that cannot be application latency optimized, the first attempt to connect will likely fail. In the client side capture, you may see an unexpected TCP Reset from the server. At the next attempt to connect, the Steelheads only perform the first two levels of optimization and the connection typically succeeds. The blacklist is short lived (20 minutes). If the client application silently retries, the user may not see any issue. However, some users may report that sometimes it works and sometimes it doesn’t. Or they may have noticed that the first attempt fails but the second works. Keep this in mind when troubleshooting intermittent network issues.
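The fail-then-succeed pattern is easier to recognize once you have seen it as a timeline. The following Python sketch is a toy model of the short-term list only. The class name `ToyBlacklist`, tracking by client/server pair, and the return strings are assumptions for illustration, and the long-term list is not modeled. It does not describe how the appliance is implemented.

```python
BLACKLIST_SECONDS = 20 * 60  # lifetime of a short-term entry, per the text above


class ToyBlacklist:
    """Toy model: a failed latency-optimization attempt lists the client/server pair."""

    def __init__(self):
        self._expires = {}  # (client_ip, server_ip) -> expiry time in seconds

    def connect(self, client_ip, server_ip, now, signed=True):
        """Return 'reset', 'bypassed' or 'optimized' for one connection attempt."""
        pair = (client_ip, server_ip)
        if self._expires.get(pair, 0) > now:
            return "bypassed"       # listed: latency optimization is skipped
        if signed:
            self._expires[pair] = now + BLACKLIST_SECONDS
            return "reset"          # attempt fails and the pair is listed
        return "optimized"


bl = ToyBlacklist()
timeline = [
    bl.connect("10.0.0.5", "10.1.0.9", now=0),      # first try
    bl.connect("10.0.0.5", "10.1.0.9", now=5),      # user retries
    bl.connect("10.0.0.5", "10.1.0.9", now=1199),   # still listed
    bl.connect("10.0.0.5", "10.1.0.9", now=1200),   # entry has expired
]
```

With signed traffic the timeline comes out as reset, bypassed, bypassed, reset, which matches the "sometimes it works and sometimes it doesn't" reports.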

In-Path Bypass rules and Peering rules

If the Steelhead appliances are not AD integrated, the best option is to have the Steelhead administrators exclude the traffic from optimization by adding an In-Path, pass-through, rule on the Steelheads. Because domain controllers always sign SMB traffic, In-Path, Pass-Through rules are recommended for Domain Controllers. On the Steelheads operating in the client role, create an in-path rule to pass-through all traffic to the domain controllers. For more information, see:

https://support.riverbed.com/bin/support/static/bkmpofug7p1q70mbrco0humch1/html/ocgj3m4oc178q0cigtfufl0m68/sh_ex_4.1_ug_html/index.html#page/sh_ex_4.1_ug_htm/setupServiceInpathRules.html

Optionally, a peering rule can be applied on the Server side Steelhead. Work with your Riverbed support professional to determine which method is recommended for your environment.

Riverbed Interceptor and Bypass Options

For troubleshooting, it may be necessary to completely bypass the Steelhead appliances for testing. If the environment is using the load distribution device “Riverbed Interceptor”, the bypass rule can be added either on the Interceptor or on the Steelheads. The Interceptor balances optimization traffic across the Steelhead appliances within the site. The Interceptor can be configured to bypass all the Steelhead appliances, sending traffic un-optimized directly to the router. If the bypass rule is set on the Steelheads instead of the Interceptor, the traffic may still be TCP and SDR (Scalable data reference) optimized. When deciding which bypass method to use, note the following:

  1. In-path, bypass rules for the Steelheads. When configured on the Steelhead, only application latency optimization is disabled. TCP optimization and scalable data reference optimization are still active. This still provides a measure of optimization. When troubleshooting, this method alone may still introduce issues.
  2. In-path, bypass rules for Interceptors. Configured on the Riverbed Interceptors, when these rules are enabled, the traffic will completely bypass the Steelhead appliances. No optimization is performed. Setting the bypass rule on the Interceptor may be required in some troubleshooting scenarios.

Figure 2 – Bypass rules configured at the interceptor

2 – Security Implications

In this section, we review the security impact of integrating Steelheads with Active Directory. This integration is required to fully optimize signed SMB traffic.

Why SMB traffic Should Be Signed

Consider the scenario where a client’s traffic is intercepted and relayed to another host. An adversary is acting as a man-in-the-middle (MitM) and the connection information may be used to connect to another resource without your knowledge. This is why it’s important that the client have some way to verify the identity of the host it’s connecting to. For SMB, the solution to this problem goes back as far as Windows NT 4.0 SP3 and Windows 98. SMB signing is the method used to cryptographically sign the SMB traffic. This is accomplished by using a session key that is derived from the authentication phase, during SMB session setup. The file server uses its long-term key, aka the computer account password, to complete this phase and prove its identity.

When traffic is “application optimized”, the Steelhead appliances operate as an authorized man-in-the-middle. They are intercepting, repackaging and then unpacking traffic to and from the real server. To sign this traffic, the Steelhead operating in the server role needs access to the session key. Without AD integration, the Steelhead appliances do not have access to the server’s long-term keys and cannot obtain the session key. The Steelhead is unable to sign the SMB Session setup packets and cannot prove it is the server the client had intended to communicate with. SMB signing is doing its job and preventing the man-in-the-middle. For the Steelheads, this protection means the traffic cannot be application latency optimized.
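Here is a rough Python sketch of why a box in the middle cannot simply re-sign repackaged traffic. It is a simplification that assumes the HMAC-SHA256, truncated to 16 bytes, style of signature used by the SMB 2.0.2 and 2.1 dialects (SMB3 dialects use a different algorithm). The message bytes, key values and function names are made up for illustration. Real SMB signing covers the whole packet and the key comes from the authentication exchange.

```python
import hashlib
import hmac
import os


def sign(session_key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 over the message, truncated to a 16-byte signature."""
    return hmac.new(session_key, message, hashlib.sha256).digest()[:16]


def verify(session_key: bytes, message: bytes, signature: bytes) -> bool:
    """Recompute the signature with the session key and compare in constant time."""
    return hmac.compare_digest(sign(session_key, message), signature)


session_key = os.urandom(16)     # shared by the client and the real server only
middlebox_key = os.urandom(16)   # a proxy without the server's secrets can only guess

original = b"SMB2 READ response: quarterly-report.docx"
repackaged = b"SMB2 READ response: quarterly-report.docx (re-framed in transit)"

# The real server signs with the session key, so the client accepts the packet.
server_ok = verify(session_key, original, sign(session_key, original))

# The proxy changed the bytes and signed with a key the client does not share.
proxy_ok = verify(session_key, repackaged, sign(middlebox_key, repackaged))
```

The client accepts the first packet and rejects the second. AD integration exists to give the server side Steelhead what it needs to obtain the real session key, which is also why it is such a sensitive change.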

Signing with SMB3 and Windows 10 (Secure Negotiate)

SMB3 added another layer of protection called secure negotiate. During the client / server SMB negotiation, the client and server agree on the highest supported SMB dialect. The server response contains the negotiated SMB dialect, along with a field in the packet containing the dialect value signed by the server. This step ensures that the SMB dialect cannot be downgraded to an older, weaker dialect by a man-in-the-middle. Compared to traditional SMB signing, far less of the SMB packet is signed. However, this still causes an issue for the Steelheads that are not able to sign the field containing the negotiated dialect.

In Windows 8, it was possible to disable Secure Negotiate. While inadvisable, many admins opted to restore WAN performance at the expense of SMB security. In Windows 10, the capability to downgrade security by disabling Secure Negotiate is no longer available. That leaves only two options for Windows 10:

  • Integrate the Steelhead appliance (server role in the datacenter) with AD.
  • Bypass the Steelhead appliances for application optimization. This significantly reduces the effectiveness of the Steelhead appliances in some scenarios.

NOTE: When a Windows 10 client (SMB 3.1.1) is communicating with a down-level server, the client will use Secure Negotiate. However, when a Windows 10 client is communicating with a server running SMB 3.1.1, it will use the more advanced variant, Pre-Authentication Integrity. Pre-Authentication Integrity also uses signed packets and has the same Steelhead requirements as Secure Negotiate. For more information, see https://blogs.msdn.microsoft.com/openspecification/2015/08/11/smb-3-1-1-pre-authentication-integrity-in-windows-10/

RIOS versions 9.1.3 and below currently do not support Pre-Authentication Integrity. Contact your Riverbed Support professional for options if this scenario applies to your environment.

For an in-depth discussion of the Secure Negotiate feature, see: https://blogs.msdn.microsoft.com/openspecification/2012/06/28/smb3-secure-dialect-negotiation/

Steelhead options for Optimizing Signed SMB Traffic

Now we know that Steelhead needs the server’s long term key (computer account password), to sign the response and subsequently fully “application optimize” the traffic. The required permissions vary by authentication protocol (NTLM or Kerberos) and RIOS version. For this discussion, we will focus on RIOS version 7 and above.

AD Integration – Kerberos

The Steelheads use one of two methods to obtain the session keys needed to sign optimized SMB traffic. To optimize SMB traffic authenticated using NTLM, the Steelheads use a computer account with RODC privileges. More on this in the next section. To optimize traffic authenticated using Kerberos, the Steelheads do not need an RODC account but instead use a highly privileged service account along with a workstation computer account. While this mode requires far more privileges than the RODC approach, it is required for optimizing Kerberos authentication sessions. Additionally, this method does not suffer from the RODC account related issues discussed in the troubleshooting section.

As of RIOS version 7.0 and above, the “End-To-End Kerberos Mode” or eeKRB is recommended by Riverbed. This mode replaces the deprecated “Constrained Delegation Mode” in older versions of RIOS. The account requirements for a Steelhead operating in the server role are:

  1. A Service Account with the following permission on root of each domain partition containing servers to optimize.

    “Replicate Directory Changes”

    “Replicate Directory Changes All”.

  2. The Steelheads need to be joined to at least one domain in the forest as a workstation, not as an RODC.

The “Replicate Directory Changes All” permission grants the Steelhead service account access to replicate directory objects, including domain secrets. Domain secrets include the attributes where account password hashes are stored. This includes the password hashes for the file servers whose traffic is to be optimized, as well as hashes for users, administrators, trusts and domain controllers.

The other critical piece of the “Replicate Directory Changes All” permission is that it bypasses the RODC’s password replication policy. There is no way to constrain access only to the computer accounts to be optimized.

In most environments, all SMB traffic should already be using Kerberos for authentication. In these environments, it is not necessary to configure the Steelheads with an RODC account. This is important as it will prevent many of the issues with the RODC account approach. The security impact of this configuration is considerable and needs careful planning and risk mitigation. More on this below.

AD Integration – NTLM

To support NTLM authentication, the Steelhead operating in the server role will need to be joined as a workstation and its UserAccountControl set to 83955712 (Dec), or 0x5011000 (Hex). This maps to the following capabilities:

  • PARTIAL_SECRETS_ACCOUNT
  • TRUSTED_TO_AUTH_FOR_DELEGATION
  • WORKSTATION_TRUST_ACCOUNT
  • DONT_EXPIRE_PASSWORD
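If you want to check the arithmetic behind that value, the four flags can be decoded with a few lines of Python. The bit values below are the documented userAccountControl values for these four flags. The table deliberately contains only the flags named in the list above, and the function name `decode_uac` is just an illustrative choice.

```python
# userAccountControl bits for the four capabilities listed above
UAC_FLAGS = {
    0x04000000: "PARTIAL_SECRETS_ACCOUNT",
    0x01000000: "TRUSTED_TO_AUTH_FOR_DELEGATION",
    0x00010000: "DONT_EXPIRE_PASSWORD",
    0x00001000: "WORKSTATION_TRUST_ACCOUNT",
}


def decode_uac(value: int):
    """Return (flag names set in value, bits not covered by this small table)."""
    names = [name for bit, name in UAC_FLAGS.items() if value & bit]
    leftover = value & ~sum(UAC_FLAGS)  # summing a dict sums its keys (the bits)
    return names, leftover


names, leftover = decode_uac(83955712)
as_hex = hex(83955712)
```

Decoding 83955712 yields exactly those four flags with nothing left over, and the hex form is 0x5011000.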

The UserAccountControl settings grant access to the partial secrets. These are the same permissions that are used by RODCs to replicate the domain secrets for accounts included in the RODC Password Replication Policy for the account. The flag TRUSTED_TO_AUTH_FOR_DELEGATION allows the account to perform protocol transition. KB3192506 describes the risk:

“The account can impersonate any user in the Active Directory except those who are marked as “sensitive and not allowed for delegation.” Because of Kerberos Protocol transition, the account can do this even without having the password for impersonation-allowed users.”

A UserAccountControl value of 0x5011000 will identify the account as an RODC. This causes issues for tools that query for DCs based on the UserAccountControl value. For more details, see the section “Issues with UserAccountControl set to 0x5011000” below.

Security Implications of AD Integration – What is exposed?

The decision to integrate Steelhead should be the outcome of a collaboration between a company’s security team, the company’s networking team and the company’s Active Directory team. To inform this discussion, consider the value of the data being shared with the Steelhead appliances along with the resulting “effective control” of that data. Microsoft’s current guidance regarding protection of data and securing access can be found in the Securing Privileged Access Reference. The reference defines a concept called the Clean Source Principle.

“The clean source principle requires all security dependencies to be as trustworthy as the object being secured.

“Any subject in control of an object is a security dependency of that object. If an adversary can control anything in effective control of a target object, they can control that target object. Because of this, you must ensure that the assurances for all security dependencies are at or above the desired security level of the object itself. “

Source: https://technet.microsoft.com/windows-server-docs/security/securing-privileged-access/securing-privileged-access-reference-material#a-name-csp-bm-a-clean-source-principle

In this case, the data is the secrets for all domains in the forest. Because a domain controller only holds secrets for its own domain, the service account has higher privileges than any single DC. The service account can authenticate as any user in the forest to any resource within the forest or any trusting forest where permissions are granted. This means the service account has either direct or indirect control of all data in the forest as well as any trusting forests (where permissions are granted). To put this in perspective, the service account can act as any account (user, computer, DC) to access any resources in the forest including all servers containing high value IP, email, financial reports, etc. Since the Riverbed administrators have control of the service account, these admins now have indirect control of the entire forest.

Below are some guidelines for securing privileged access. While this applies to domain controllers and domain administrators, these practices and requirements may be extended to any highly privileged device or account. When reviewing the requirements below, consider these questions:

Can the Steelhead environment, including appliances, management workstations and all accounts with direct or indirect access, be secured to the same level as the DCs and domain administrators?

If so, what is the financial impact of both initial and ongoing operational changes needed to secure the Steelhead environment?

Finally, do the optimization benefits outweigh the increased operation cost and potential increase in risk to the forest?

In order to secure any accounts or appliance that have this level of privilege the Securing Privileged Access Reference provides a good starting point.


Tier 0 administrator – manage the identity store and a small number of systems that are in effective control of it, and:

  • Can only log on interactively or access assets trusted at the Tier 0 level
  • Separate administrative accounts
  • No browsing the public Internet with admin accounts or from admin workstations
  • No accessing email with admin accounts or from admin workstations
  • Store service and application account passwords in a secure location
    • physical safe,
    • Ensure that all access to the password is logged, tracked, and monitored by a disinterested party, such as a manager who is not trained to perform IT administration.
  • Enforce smartcard multi-factor authentication (MFA) for all admin accounts, no administrative account can use a password for authentication.
  • A script should be implemented to automatically and periodically reset the random password hash value by disabling and immediately re-enabling the attribute “Smart Card Required for Interactive Logon.”

Additional guidance comes from “Securing Domain Controllers Against Attack”:

https://technet.microsoft.com/en-us/windows-server-docs/identity/ad-ds/plan/security-best-practices/securing-domain-controllers-against-attack

Highlights include

  • In datacenters, physical domain controllers should be installed in dedicated secure racks or cages that are separate from the general server population.
  • OS Updates and Patch Deployments – To ensure known security vulnerabilities are not exploited, all appliances should be kept up to date in accordance with the manufacturer’s guidance. This may include running the current operating system and patch level.
  • Remote access such as RDP or SSH should be highly restricted and only permitted from secured admin workstations.

The above list is not an endorsement of Steelhead security, but serves as a starting point for planning the security of extending Tier 0 beyond the DCs and domain admins. Additional mitigations should include timely patching, a complex password that is regularly changed, and mitigations provided by Riverbed.

3 – Troubleshooting Network Issues when Steelheads are involved

This section will cover common issues and troubleshooting guidance. Troubleshooting with Steelheads generally follows this flow:

  • Are Steelheads involved in the conversation? Even if Steelheads are present in the environment, they may not be causing the networking issue. They may not be deployed everywhere in the environment or Steelheads can be configured to bypass some traffic. So even if they are in use and in the network path they may not be negatively affecting the network traffic.
  • Are the Steelheads the cause of the network issue? Network issues have existed long before Steelheads arrived. Careful diagnosis is required. The Steelheads may be optimizing traffic but not causing the network issue. When in doubt, have the Steelhead administrators bypass the traffic for the affected machines for testing.
  • Test by bypassing the Steelheads. When the Steelheads are determined to be the cause of networking issues, they can be bypassed in one of several ways. See the bypass section for more details.

Detecting Riverbed Probe Info in captures

When troubleshooting connectivity issues, if you suspect that a Steelhead appliance is involved, there are several ways to detect it. Most of them require a simultaneous capture on the client and the server. With the first two methods below, the capture is only performed on one side.

Riverbed Probe Info

The capture must be from the server side and it must include the SYN packet from the client. Locate the SYN packet in the capture and check for option 76. This packet will also show which Steelhead the client accessed. This can be helpful when engaging the networking team. Wireshark parsers show this as a Riverbed Probe Query. Use the display filter ‘tcp.options.rvbd.probe’

Internet Protocol Version 4, Src: <source IP Address>, Dst: <Destination IP address>

Transmission Control Protocol, Src Port: <source port>, Dst Port: <destination port>, Seq: 0, Len: 0

Riverbed Probe, Riverbed Probe, No-Operation (NOP), End of Option List (EOL)

No-Operation (NOP)

Riverbed Probe: Probe Query, CSH IP: <Steelhead IP Address>

Kind: Riverbed Probe (76)

Reserved: 0x01

CSH IP: <Steelhead IP Address>

Riverbed Probe: Probe Query Info

Probe Flags: 0x05

For a complete list of Riverbed display filters, see https://www.wireshark.org/docs/dfref/t/tcp.html

Search for filters starting with “tcp.options.rvbd”. Or enter “tcp.options.rvbd” in the Wireshark filter field and the list of available filters will be displayed.
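If you would rather script the option 76 check than eyeball it, the TCP options field is a simple kind/length/value list, so it can be walked in a few lines of Python. This is a sketch under assumptions: the sample option bytes are invented (the payload of the probe option is zero filled, not a real probe), and the function names are mine. Only the option kind, 76, comes from the capture above.

```python
RIVERBED_PROBE_KIND = 76  # option kind shown as "Riverbed Probe (76)" in the capture


def tcp_option_kinds(options: bytes) -> list:
    """Walk a TCP options field and return the option kinds in order."""
    kinds, i = [], 0
    while i < len(options):
        kind = options[i]
        kinds.append(kind)
        if kind == 0:               # End of Option List: nothing follows
            break
        if kind == 1:               # No-Operation is a single byte
            i += 1
            continue
        if i + 1 >= len(options):   # truncated option, stop walking
            break
        length = options[i + 1]     # length counts the kind and length bytes too
        if length < 2:              # malformed option, stop walking
            break
        i += length
    return kinds


def has_riverbed_probe(options: bytes) -> bool:
    """True when the options field carries an option of kind 76."""
    return RIVERBED_PROBE_KIND in tcp_option_kinds(options)


# Invented SYN options: MSS, a kind-76 option with a dummy 8-byte payload, NOP, EOL
probed_syn = bytes([2, 4, 5, 180, 76, 10] + [0] * 8 + [1, 0])
# Invented SYN options with no probe: MSS, NOP, NOP, SACK permitted
plain_syn = bytes([2, 4, 5, 180, 1, 1, 4, 2])

probed = has_riverbed_probe(probed_syn)
plain = has_riverbed_probe(plain_syn)
```

To use it on real traffic you would feed it the raw options bytes of the client's SYN as exported from your capture tool. How you extract those bytes depends on the tool and is left out here.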

Detecting Steelheads using the TTL / Hoplimit values in the IP header

The IP header of each packet contains a TTL value for IPv4. For IPv6 this is called Hoplimit. The default value for Windows systems is 128. The value is decremented by 1 each time the packet traverses a router. When the packets originate from non-Windows systems, the value is often considerably lower. For Steelheads, this value starts at 64 and is typically 60 to 63 after the packet traverses a router or two. A value of 64 or below does not always mean the packet traversed a Steelhead appliance. But if the packet was sent by a Windows system, it’s clear the TTL was modified en route, and the Steelhead appliance may be the source of that change.

IPv4 TTL Example

Ipv4: Src = XXXXX Dest = XXXXX 3, Next Protocol = TCP, Packet ID = 62835, Total IP Length = 1400

  + Versions: IPv4, Internet Protocol; Header Length = 20

  + DifferentiatedServicesField: DSCP: 0, ECN: 0

    TotalLength: 1400 (0x578)

    Identification: 62835 (0xF573)

  + FragmentFlags: 16384 (0x4000)

    TimeToLive: 64 (0x40)

    NextProtocol: TCP, 6(0x6)

    Checksum: 22330 (0x573A)

    SourceAddress: XXXXX

    DestinationAddress: XXXXX

IPv6 Hoplimit Example

+ Ethernet: Etype = IPv6,DestinationAddress:[XXXXX],SourceAddress:[XXXXX]

– Ipv6: Next Protocol = TCP, Payload Length = 272

+ Versions: IPv6, Internet Protocol, DSCP 0

PayloadLength: 272 (0x110)

NextProtocol: TCP, 6(0x6)


HopLimit: 64 (0x40)

SourceAddress: XXXXX

DestinationAddress: XXXXX
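The TTL / Hoplimit reasoning above boils down to a small heuristic, sketched here in Python. The 128 and 64 starting values come from the text. Treating 255 as a third common starting value, and the function names, are my assumptions. As noted above, a low value is a hint and not proof that a Steelhead touched the packet.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # 64 and 128 per the text; 255 is an assumption


def likely_initial_ttl(observed: int) -> int:
    """Guess the starting TTL: the smallest common default at or above the observed value."""
    for initial in COMMON_INITIAL_TTLS:
        if observed <= initial:
            return initial
    raise ValueError("TTL / Hoplimit must be 255 or less")


def looks_proxied(observed_ttl: int, sender_is_windows: bool) -> bool:
    """True when a packet from a Windows sender arrives as if it had started at 64."""
    return sender_is_windows and likely_initial_ttl(observed_ttl) == 64


from_windows_direct = looks_proxied(126, sender_is_windows=True)
from_windows_via_64 = looks_proxied(62, sender_is_windows=True)
from_other_device = looks_proxied(62, sender_is_windows=False)
```

A Windows sender seen at 126 looks untouched, while the same sender seen at 62 deserves a closer look at what sits in the path.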

Detecting Steelheads with Simultaneous network captures

In many cases, simultaneous captures will be required to verify where the failure occurred in the conversation. In the captures, look for the following to determine if the traffic is traversing Steelhead appliances. Align the captures by IP address and port number. Then align the start of the conversation using a common packet such as “SMB Negotiate” that is present in both captures. If the packets traverse a Steelhead, you will likely see:

  • The SMB Session ID does not match.
  • The packet size and sequence numbers for the same traffic do not match between the two captures.
  • Additionally, you may see a TCP reset at the client that was never sent by the server. This can happen if the conversation is signed and the Steelheads are not AD Integrated. See Steelhead’s IP Blacklist earlier in this blog for more info.
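The comparison described above can be sketched in Python once the interesting fields have been exported from both captures. Everything about the record layout here is hypothetical: the dictionary keys (`src`, `sport`, `dst`, `dport`, `seq`, `length`, `smb_session_id`), the sample values and the function names are placeholders for whatever your capture tool exports. Each list is assumed to hold one record per conversation, taken at the packet you chose as the alignment point.

```python
def flow_key(pkt: dict) -> tuple:
    """IP and port transparency means this 4-tuple matches on both sides."""
    return (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])


def compare_captures(client_pkts, server_pkts,
                     fields=("seq", "length", "smb_session_id")):
    """For each flow present in both captures, list the fields that differ."""
    server_by_flow = {flow_key(p): p for p in server_pkts}
    report = {}
    for pkt in client_pkts:
        key = flow_key(pkt)
        if key in server_by_flow:
            other = server_by_flow[key]
            report[key] = [f for f in fields if pkt[f] != other[f]]
    return report


# Hypothetical exported records for the same conversation, seen from each side
client_side = [{"src": "10.0.0.5", "sport": 50123, "dst": "10.1.0.9", "dport": 445,
                "seq": 1001, "length": 1400, "smb_session_id": 0x1111}]
server_side = [{"src": "10.0.0.5", "sport": 50123, "dst": "10.1.0.9", "dport": 445,
                "seq": 884422, "length": 1260, "smb_session_id": 0x2222}]

report = compare_captures(client_side, server_side)
```

A flow whose report lists all three fields is a strong hint that something proxied the conversation. An empty list means both ends saw the same packet.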

SMB Negotiates down to 2.02

Steelheads have a mode called “Basic Dialect” which is often used to disable client leasing and force the use of oplocks. In this mode, the Steelheads intercept and modify the available dialects supported by the server. While this behavior may not cause any issues for SMB 2.02 supported capabilities, it does mean that SMB3 capabilities will be disabled. Here is a list of capabilities that will be disabled in this mode, from https://support.microsoft.com/en-us/kb/2709568 and https://blogs.technet.microsoft.com/josebda/2015/05/05/whats-new-in-smb-3-1-1-in-the-windows-server-2016-technical-preview-2/

  • SMB Transparent Failover
  • SMB Scale Out
  • SMB Multichannel
  • SMB Direct
  • SMB Encryption including support for AES-128-GCM
  • VSS for SMB file shares
  • SMB Directory Leasing
  • SMB PowerShell
  • Cluster Dialect Fencing
  • Pre-Authentication Integrity

In this example of network capture, a Server 2012 R2 client is connecting to a Server 2012 R2 Server. The traffic is signed and not latency optimized. This is the packet that left the client. Notice the SMB dialect offered by the client is 202 through 302.

SMB2 (Server Message Block Protocol version 2)

SMB2 Header

Negotiate Protocol Request (0x00)

StructureSize: 0x0024

Dialect count: 4

Security mode: 0x01, Signing enabled

Reserved: 0000

Capabilities: 0x0000007f, DFS, LEASING, LARGE MTU, MULTI CHANNEL, PERSISTENT HANDLES, DIRECTORY LEASING, ENCRYPTION

Client Guid: 5c050930-680d-11e6-80d0-9457a55aefad

NegotiateContextOffset: 0x0000

NegotiateContextCount: 0

Reserved: 0000

Dialect: 0x0202

Dialect: 0x0210

Dialect: 0x0300

Dialect: 0x0302

Here is the same packet when it arrived at the server. Notice that the Steelhead has removed all available dialects except 202.

Altered packet that arrives to the Server

SMB2 (Server Message Block Protocol version 2)

SMB2 Header

Negotiate Protocol Request (0x00)

StructureSize: 0x0024

Dialect count: 4

Security mode: 0x01, Signing enabled

Reserved: 0000

Capabilities: 0x0000007f, DFS, LEASING, LARGE MTU, MULTI CHANNEL, PERSISTENT HANDLES, DIRECTORY LEASING, ENCRYPTION

Client Guid: 5c050930-680d-11e6-80d0-9457a55aefad

NegotiateContextOffset: 0x0000

NegotiateContextCount: 0

Reserved: 0000

Dialect: 0x0202

Dialect: 0x0000

Dialect: 0x0000

Dialect: 0x0000
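When you have both sides of the Negotiate request, spotting the removed dialects is a simple list comparison. The Python sketch below uses the dialect values from the two captures above. Typing the values in by hand, and the function names, are assumptions made for the example. Pulling the dialect array out of a capture file automatically is not shown.

```python
def offered_dialects(raw: list) -> list:
    """Drop zeroed slots (0x0000) from the dialect array of a Negotiate request."""
    return [d for d in raw if d != 0x0000]


def removed_dialects(client_side: list, server_side: list) -> list:
    """Dialects the client offered that are missing when the request reaches the server."""
    arrived = set(offered_dialects(server_side))
    return [d for d in offered_dialects(client_side) if d not in arrived]


# Dialect arrays from the client-side and server-side captures above
client_side = [0x0202, 0x0210, 0x0300, 0x0302]
server_side = [0x0202, 0x0000, 0x0000, 0x0000]

removed = removed_dialects(client_side, server_side)
removed_as_hex = [f"0x{d:04x}" for d in removed]
```

For the captures above this reports 0x0210, 0x0300 and 0x0302 as removed in transit, leaving 0x0202 as the only dialect the server can choose.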

Troubleshooting

This section covers common troubleshooting scenarios where Steelheads are involved in the conversation.

Intermittent Connectivity – Knock twice to enter

While intermittent connectivity can be a transient network issue, the effects of Steelheads have a very specific pattern. The first attempt to connect fails with a “Network Path Not Found” or other generic network failure. A network capture on the client will show either an unexpected TCP reset from the server, or an unexpected Ack, Fin terminating the connection. A capture taken at the server will show that these packets were never sent from the server.

The second (or sometimes the third) attempt to connect is successful. The connection works for over 20 minutes and then may fail again. Retrying the connection twice more establishes the connection once again. This pattern occurs when the SMB session or negotiated dialect field is signed and the Steelheads are not AD Integrated. The Steelhead is attempting to perform all 3 levels of optimization and is not able to perform the last one, Application Latency, due to the signing requirements. The Steelhead then puts the client’s and server’s IP addresses on a temporary bypass list with a lifetime of 20 minutes. The existing session must be torn down and a new session established with the bypass in place. The TCP reset is what triggers the tear down. The next time the client retries the connection, the Steelheads will not attempt application latency optimization and the operation succeeds.

DC Promo Failure

During promotion of a 2012 R2 Domain controller using a 2012 R2 helper DC (both using SMB 3.0), the candidate DC receives an unexpected A…F (Ack, Fin) packet while checking for the presence of its machine account on the helper DC. The following is logged in the DCpromoUI log

Calling DsRoleGetDcOperationResults

Error 0x0 (!0 => error)

Operation results:

OperationStatus : 0x6BA !0 => error

DisplayString : A domain controller could not be contacted for the domain contoso.com that contained an account for this computer. Make the computer a member of a workgroup then rejoin the domain before retrying the promotion.

After receiving the unexpected A..F packet from the helper DC, the candidate DC hits the failure and does not retry. This happens even if the promotion is using IFM (install from media). In many of these scenarios, the servers are added to the dynamic bypass list called the blacklist. However, in this scenario, retrying the operation still fails. To address this issue, manually configuring a bypass rule on the Steelheads is required.

Issues with UserAccountControl set to 0x5011000

This section covers the issues that may be present when joining the Steelhead appliances with UserAccountControl set to 0x5011000 (Hex). When this value is set on the UserAccountControl attribute, the Steelhead will appear to many tools as an RODC. Because the Steelhead does not provide RODC services, several tools and processes encounter failures.

Setting UserAccountControl to 0x5011000 (Hex) is only necessary when NTLM is used between the client and the server. If Kerberos is used to authenticate the user, then the Steelhead can be joined as a regular workstation. To avoid the issues below, consider using the Steelhead to optimize traffic only when Kerberos is used.

AD Tools Detect Steelhead Accounts as DCs

Tools that rely on UserAccountControl values to enumerate DCs in a domain will find Steelhead Appliances joined with UserAccountControl set to 83955712 (Dec), or 0x5011000 (Hex). Some examples are:

  • nltest /dclist:domain.com
  • [DirectoryServices.Activedirectory.Forest]::GetCurrentForest().domains | %{$_.domaincontrollers} | ft name,OSVersion,Domain
  • The DFSR migration tool, dfsrmig.exe, will find the Steelhead accounts and prevent the transition to the “Eliminated” state. This is because the tool expects each account to respond as a DC and report its migration state when queried. The migration cannot reach its final state until the Steelhead accounts are removed from the domain.
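If you want to see which computer accounts could be picked up this way, one option is an LDAP search on the UserAccountControl bits. The sketch below only builds the filter string; it uses the standard bitwise-AND matching rule (OID 1.2.840.113556.1.4.803) and the 0x4000000 bit, which Microsoft documents as PARTIAL_SECRETS_ACCOUNT and which is part of 0x5011000. Treat it as an assumption that this is the bit a given tool keys on, and note that the filter also returns genuine RODCs, so compare the results against your known RODC list.

```python
# Documented UserAccountControl bit that marks an account as holding
# partial secrets (the "RODC" flag). It is one of the bits in 0x5011000.
PARTIAL_SECRETS_ACCOUNT = 0x4000000

# OID of the LDAP bitwise-AND matching rule supported by Active Directory.
LDAP_MATCHING_RULE_BIT_AND = "1.2.840.113556.1.4.803"


def uac_bit_filter(flag):
    """Build an LDAP filter for computer accounts with the given UAC bit set."""
    return (
        "(&(objectCategory=computer)"
        f"(userAccountControl:{LDAP_MATCHING_RULE_BIT_AND}:={flag}))"
    )


print(uac_bit_filter(PARTIAL_SECRETS_ACCOUNT))
# (&(objectCategory=computer)(userAccountControl:1.2.840.113556.1.4.803:=67108864))
```

The resulting string can be pasted into LDP or any other LDAP client you already use for searches.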

Pre-created RODC Accounts and Slow logon

When pre-creating a machine account for the Steelhead, create the account as a regular workstation, not as an RODC. When an RODC account is pre-created, the process also creates a server object and an NTDS Settings object in the Sites container. The DFS service hosting SYSVOL may then discover the Steelhead machine account and include its server name in the DFS referrals for SYSVOL. This can contribute to a slow logon experience as clients try and fail to connect to the Steelhead appliance.

The partition knowledge table (dfsutil /pktinfo) might show a list like this with the Riverbed appliance at top:

Entry: \contoso.com\sysvol

ShortEntry: \contoso.com\sysvol

Expires in 752 seconds

UseCount: 0 Type:0x1 ( DFS )

0:[\Steelhead1.contoso.com\sysvol] AccessStatus: 0xc00000be ( TARGETSET )

1:[\DC1.contoso.com\sysvol] AccessStatus: 0 ( ACTIVE )

2:[\DC2.contoso.com\sysvol] AccessStatus: 0

To prevent this issue, do not use the RODC pre-creation wizard to create Steelhead machine accounts. Create the account as a workstation and then modify the UserAccountControl values. To recover from this state, delete the NTDS Settings objects for the Steelhead accounts. Note that it will take some time for the DFS caches on the client and domain root to time out and refresh.
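When checking many clients, it can help to scan saved pktinfo output for referral targets that did not connect cleanly. The following is a minimal sketch that assumes the output looks like the sample shown above (target lines of the form `N:[\server\share] AccessStatus: value`); the function name and regular expression are my own, and the output format on your build may differ.

```python
import re

# Matches referral target lines such as:
#   0:[\Steelhead1.contoso.com\sysvol] AccessStatus: 0xc00000be ( TARGETSET )
TARGET_LINE = re.compile(
    r"^\s*\d+:\[(?P<path>[^\]]+)\]\s+AccessStatus:\s*(?P<status>0x[0-9a-fA-F]+|\d+)"
)


def failed_targets(pktinfo_text):
    """Return (path, status) for every target whose AccessStatus is non-zero."""
    failed = []
    for line in pktinfo_text.splitlines():
        match = TARGET_LINE.match(line)
        # int(..., 0) accepts both "0" and hex strings like "0xc00000be".
        if match and int(match.group("status"), 0) != 0:
            failed.append((match.group("path"), match.group("status")))
    return failed


sample = r"""
Entry: \contoso.com\sysvol
0:[\Steelhead1.contoso.com\sysvol] AccessStatus: 0xc00000be ( TARGETSET )
1:[\DC1.contoso.com\sysvol] AccessStatus: 0 ( ACTIVE )
2:[\DC2.contoso.com\sysvol] AccessStatus: 0
"""

for path, status in failed_targets(sample):
    print(f"{path} returned {status}")
```

Any server listed here that is not a real domain controller is a candidate for the cleanup described above.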

DSID Error Viewing the Steelhead Machine Account

When viewing the Steelhead machine account with AD Users and Computers or ADSIEdit on Server 2008 R2, you may encounter a DSID error. This error occurs because the UserAccountControl values make the account appear to be an RODC. The interface then attempts to query for NtdsObjectName, which it cannot find because regular workstations do not have an NTDS Settings object.

This issue does not occur when using LDP on Server 2008 R2. Server 2012 R2 and the latest RSAT tools are also not affected.

Domain Join Failure when not using a Domain Admin Account

The Steelhead appliance can be joined to a domain either as a workstation or as an account with RODC flags. During the operation, the credentials entered on the Steelhead UI will be used to modify the UserAccountControl value as well as add Service Principal Names. There are two operations that may fail, and here’s why:

UserAccountControl – With security enhancements in MS15-096, modification of the UserAccountControl value by non-admins is no longer permitted.

The Steelhead log may report:

Failed to join domain: user specified does not have admin privilege

This is caused by a security change in MS15-096 that prevents non-administrators from changing any UserAccountControl flag that alters the account type.

3072595 MS15-096: Vulnerability in Active Directory service could allow denial of service: September 8, 2015 http://support.microsoft.com/kb/3072595/EN-US

ServicePrincipalName – The SPNs being written will be for the HOST/Steelhead account and the HOST/SteelheadFQDN. The account writing the SPNs is not the same as the Steelhead’s computer account and will not be able to update the SPN. The Steelhead log may report:

Failed to join domain: failed to set machine spn

Entering domain administrator credentials in the Steelhead domain join UI is not recommended for security reasons. Additionally, it may not even be possible for accounts that require smartcards for logon.

Workaround

  • A workaround is to pre-create the computer account, set the correct UserAccountControl values and give the user account that is joining the Steelheads full control over the account.
  • During domain join using the Steelhead UI, the default container “cn=computers” is used. If the pre-created account is in a different OU, the domain join must be performed using the CLI commands on the Steelhead. Refer to the Riverbed documentation for the syntax.
  • The pre-created accounts will need to have the UserAccountControl value set correctly before the Steelhead attempts to join the domain. Use the following values:
    • Joining as a Workstation = 69632 (Dec) or 0x11000 (Hex)
    • Joining with RODC flags = 83955712 (Dec), or 0x5011000 (Hex).
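To see what these two values actually contain, the sketch below breaks them into individual flags. The bit names come from Microsoft's documented UserAccountControl flags; the helper function itself is just an illustration, and it only knows the four bits that appear in these two values.

```python
# Documented UserAccountControl bits that appear in the two values above.
UAC_FLAGS = {
    0x00001000: "WORKSTATION_TRUST_ACCOUNT",
    0x00010000: "DONT_EXPIRE_PASSWORD",
    0x01000000: "TRUSTED_TO_AUTH_FOR_DELEGATION",
    0x04000000: "PARTIAL_SECRETS_ACCOUNT",
}


def decode_uac(value):
    """Return (flag names set in value, leftover bits this table does not know)."""
    names = [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]
    leftover = value & ~sum(UAC_FLAGS)
    return names, leftover


for label, value in (("Workstation", 69632), ("RODC flags", 83955712)):
    names, leftover = decode_uac(value)
    print(f"{label}: {value} = {hex(value)} -> {', '.join(names)}")
```

The only difference between the two values is the pair of high bits, TRUSTED_TO_AUTH_FOR_DELEGATION and PARTIAL_SECRETS_ACCOUNT, which is why the second value makes the account look like an RODC to other tools.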

Summary

For many environments, WAN optimization provides significant improvements over existing links. This is especially true for regions where high speed, low latency WAN capabilities are either cost prohibitive or non-existent. Supporting these capabilities in a secure way requires collaboration between groups that may have historically operated more independently.

After thoroughly evaluating the risks associated with AD integration, some environments may find the cost of risk mitigation and operational overhead prohibitive. In some cases, it will simply not be possible to mitigate the risks to an acceptable level. In other environments, the exercise of mitigating risks may improve the overall security of the environment.

References

Performance Brief – Signed SMB311 – https://splash.riverbed.com/docs/DOC-5622

Riverbed Unsigned SMB3 Performance Brief – https://splash.riverbed.com/docs/DOC-3198

How WAN Optimization Works – http://www.riverbednews.com/2014/11/how-wan-optimization-works/

Technical Overview of RiOS 8.5 – https://splash.riverbed.com/docs/DOC-1198

Steelhead RiOS 9.0 Technical Overview – https://splash.riverbed.com/docs/DOC-5505

Configuring Steelhead In-Path Rules – https://support.riverbed.com/bin/support/static/bkmpofug7p1q70mbrco0humch1/html/ocgj3m4oc178q0cigtfufl0m68/sh_ex_4.1_ug_html/index.html#page/sh_ex_4.1_ug_htm/setupServiceInpathRules.html

Secure Negotiate – https://blogs.msdn.microsoft.com/openspecification/2012/06/28/smb3-secure-dialect-negotiation/

SMB Preauthentication Integrity – https://blogs.msdn.microsoft.com/openspecification/2015/08/11/smb-3-1-1-pre-authentication-integrity-in-windows-10/

RODC Technical Reference Topics (Includes information on “domain secrets”) – https://technet.microsoft.com/sv-se/library/cc754218(v=ws.10).aspx

Third-party information disclaimer

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.