Access-Based Enumeration (ABE) Concepts (part 1 of 2)

Hello everyone, Hubert from the German Networking Team here.  Today I want to revisit a topic that I wrote about in 2009: Access-Based Enumeration (ABE)

This is the first part of a 2-part Series. This first part will explain some conceptual things around ABE.  The second part will focus on diagnostic and troubleshooting of ABE related problems.  The second post is here.

Access-Based Enumeration has existed since Windows Server 2003 SP1 and has not change in any significant form since my Blog post in 2009. However, what has significantly changed is its popularity.

With its integration into V2 (2008 Mode) DFS Namespaces and the increasing demand for data privacy, it became a tool of choice for many architects. However, the same strict limitations and performance impact it had in Windows Server 2003 still apply today. With this post, I hope to shed some more light here as these limitations and the performance impact are either unknown or often ignored. Read on to gain a little insight and background on ABE so that you:

  1. Understand its capabilities and limitations
  2. Gain the background knowledge needed for my next post on how to troubleshoot ABE

Two things to keep in mind:

  • ABE is not a security feature (it’s more of a convenience feature)
  • There is no guarantee that ABE will perform well under all circumstances. If performance issues come up in your deployment, disabling ABE is a valid solution.

So without any further ado let’s jump right in:

What is ABE and what can I do with it?
From the TechNet topic:

“Access-based enumeration displays only the files and folders that a user has permissions to access. If a user does not have Read (or equivalent) permissions for a folder, Windows hides the folder from the user’s view. This feature is active only when viewing files and folders in a shared folder; it is not active when viewing files and folders in the local file system.”

Note that ABE has to check the user’s permissions at the time of enumeration and filter out files and folders they don’t have Read permissions to. Also note that this filtering only applies if the user is attempting to access the share via SMB versus simply browsing the same folder structure in the local file system.

For example, let’s assume you have an ABE enabled file server share with 500 files and folders, but a certain user only has read permissions to 5 of those folders. The user is only able to view 5 folders when accessing the share over the network. If the user logons to this server and browses the local file system, they will see all of the files and folders.

In addition to file server shares, ABE can also be used to filter the links in DFS Namespaces.

With V2 Namespaces DFSN got the capability to store permissions for each DFSN link, and apply those permissions to the local file system of each DFSN Server.

Those NTFS permissions are then used by ABE to filter directory enumerations against the DFSN root share thus removing DFSN links from the results sent to the client.

Therefore, ABE can be used to either hide sensitive information in the link/folder names, or to increase usability by hiding hundreds of links/folders the user does not have access to.

How does it work?
The filtering happens on the file server at the time of the request.

Any Object (File / Folder / Shortcut / Reparse Point / etc.) where the user has less than generic read permissions is omitted in the response by the server.

Generic Read means:

  • List Folder / Read Data
  • Read Attributes
  • Read Extended Attributes
  • Read Permissions

If you take any of these permissions away, ABE will hide the object.

So you could create a scenario (i.e. remove the Read Permission permission) where the object is hidden from the user, but he/she could still open/read the file or folder if the user knows its name.

That brings us to the next important conceptual point we need to understand:

ABE does not do access control.

It only filters the response to a Directory Enumeration. The access control is still done through NTFS.

Aside from that ABE only works when the access happens through the Server Service (aka the Fileserver). Any access locally to the file system is not affected by ABE. Restated:

“Access-based enumeration does not prevent users from obtaining a referral to a folder target if they already know the DFS path of the folder with targets. Permissions set using Windows Explorer or the Icacls command on namespace roots or folders without targets control whether users can access the DFS folder or namespace root. However, they do not prevent users from directly accessing a folder with targets. Only the share permissions or the NTFS file system permissions of the shared folder itself can prevent users from accessing folder targets.” Recall what I said earlier, “ABE is not a security feature”. TechNet

ABE does not do any caching.

Every requests causes a filter calculation. There is no cache. ABE will repeat the same exact work for identical directory enumerations by the same user.

ABE cannot predict the permissions or the result.

It has to do the calculations for each object in every level of your folder hierarchy every time it is accessed.

If you use inheritance on the folder structure, a user will have the same permission and thus the same filter result from ABE through the entire folder structure. Still ABE as to calculate this result, consuming CPU Cycles in the process.

If you enable ABE on such a folder structure you are just wasting CPU cycles without any gain.

With those basics out of the way, let’s dive into the mechanics behind the scenes:

How the filtering calculation works

  1. When a QUERY_DIRECTORY request (https://msdn.microsoft.com/en-us/library/cc246551.aspx) or its SMB1 equivalent arrives at the server, the server will get a list of objects within that directory from the filesystem.
  2. With ABE enabled, this list is not immediately sent out to the client, but instead passed over to the ABE for processing.
  3. ABE will iterate through EVERY object of this list and compare the permission of the user with the objects ACL.
  4. The objects where the user does not have generic read access are removed from the list.
  5. After ABE has completed its processing, the client receives the filtered list.

This yields two effects:

  • This comparison is an active operation and thus consumes CPU Cycles.
  • This comparison takes time, and this time is passed down to the User as the results will only be sent, when the comparisons for the entire directory are completed.

This brings us directly to the core point of this Blog:
In order to successfully use ABE in your environment you have to manage both effects.

If you don’t, ABE can cause a wide spread outage of your File services.

The first effect can cause a complete saturation of your CPUs (all cores at 100%).

This does not only increase the response times of the Fileserver to its clients to a magnitude where the Server is not accepting any new connections or the clients kill their connection after not getting a response from the server for several minutes, but it can also prevent you from establishing a remote desktop connection to the server to make any changes (like disabling ABE for instance).

The second effect can increase the response times of your fileserver (even if its otherwise Idle) to a magnitude that is not accepted by the Users anymore.

The comparison for a single directory enumeration by a single user can keep one CPU in your server busy for quite some time, thus making it more likely for new incoming requests to overlap with already running ABE calculations. This eventually results in a Backlog adding further to the delays experienced by your clients.

To illustrate this let’s roll some numbers:

A little disclaimer:

The following calculation is what I’ve seen, your results may differ as there are many moving pieces in play here. In other words, your mileage may vary. That aside, the numbers seen here are not entirely off but stem from real production environments. Performance of Disk and CPU and other workloads play into these numbers as well.

Thus the calculation and numbers are for illustration purposes only. Don’t use it to calculate your server’s performance capabilities.

Let’s assume you have a DFS Namespace with 10,000 links that is hosted on DFS Servers that have 4 CPUs with 3.5 GHz (also assuming RSS is configured correctly and all 4 CPUs are used by the File service: https://blogs.technet.microsoft.com/networking/2015/07/24/receive-side-scaling-for-the-file-servers/ ).

We usually expect single digit millisecond response times measured at the fileserver to achieve good performance (network latency obviously adds to the numbers seen on the client).

In our scenario above (10,000 Links, ABE, 3.5 Ghz CPU) it is not unseen that a single enumeration of the namespace would take 500ms.

CPU cores and speed DFS Namespace Links RSS configured per recommendations ABE enabled? Response time
4 @ 3.5 GHz 10,000 Yes No <10ms
4 @ 3.5 GHz 10,000 Yes Yes 300 – 500 ms

That means a single CPU can handle up to 2 Directory Enumerations per Second. Multiplied by 4 CPUs the server can handle 8 User Requests per Second. Any more than those 8 requests and we push the Server into a backlog.

Backlog in this case means new requests are stuck in the Processor Queue behind other requests, therefore multiplying the wait time.

This can reach dimensions where the client (and the user) is waiting for minutes and the client eventually decides to kill the TCP connection, and in case of DFSN, fail over to another server.

Anyone remotely familiar with Fileserver Scalability probably instantly recognizes how bad and frightening those numbers are.  Please keep in mind, that not every request sent to the server is a QUERY_DIRECTORY request, and all other requests such as Write, Read, Open, Close etc. do not cause an ABE calculation (however they suffer from an ABE-induced lack of CPU resources in the same way).

Furthermore, the Windows File Service Client caches the directory enumeration results if SMB2 or SMB3 is used (https://technet.microsoft.com/en-us/library/ff686200(v=ws.10).aspx ).

There is no such Cache for SMB1. Thus SMB1 Clients will send more Directory Enumeration Requests than SMB2 or SMB3 Clients (particularly if you keep the F5 key pressed).

It should now be obvious that you should use SMB2/3 versus SMB1 and ensure you leave the caches enabled if you use ABE on your servers.

As you might have realized by now, there is no easy or reliable way to predict the CPU demand of ABE. If you are developing a completely new environment you usually cannot forecast the proportion of QUERY_DIRECTORY requests in relation to the other requests or the frequency of the same.

Recommendations!
The most important recommendation I can give you is:

Do not enable ABE unless you really need to.

Let’s take the Users Home shares as an example:

Usually there is no user browsing manually through this structure, but instead the users get a mapped drive pointing to their folder. So the usability aspect does not count.  Additionally most users will know (or can find out from the Office Address book) the names or aliases of their colleagues. So there is no sensitive information to hide here.  For ease of management most home shares live in big namespace or server shares, what makes them very unfit to be used with ABE.  In many cases the user has full control (or at least write permissions) inside his own home share.  Why should I waste my CPU Cycles to filter the requests inside someone’s Home Share?

Considering all those points, I would be intrigued to learn about a telling argument to enable ABE on User Home Shares or Roaming Profile Shares.  Please sound off in the comments.

If you have a data structure where you really need to enable ABE, your file service concept needs to facilitate these four requirements:

You need Scalability.

You need the ability to increase the number of CPUs doing the ABE calculations in order to react to increasing numbers (directory sizes, number of clients, usage frequency) and thus performance demand.

The easiest way to achieve this is to do ABE Filtering exclusively in DFS Domain Namespaces and not on the Fileservers.

By that you can add easily more CPUs by just adding further Namespace Servers in the sites where they are required.

Also keep in mind, that you should have some redundancy and that another server might not be able to take the full additional load of a failing server on top of its own load.

You need small chunks

The number of objects that ABE needs to check for each calculation is the single most important factor for the performance requirement.

Instead of having a single big 10,000 link namespace (same applies to directories on file servers) build 10 smaller 1,000 link-namespaces and combine them into a DFS Cascade.

By that ABE just needs to filter 1,000 objects for every request.

Just re-do the example calculation above with 250ms, 100ms, 50ms or even less.

You will notice that you are suddenly able to reach very decent numbers in terms of Requests/per Second.

The other nice side effect is, that you will do less calculations, as the user will usually follow only one branch in the directory tree, and is thus not causing ABE calculations for the other branches.

You need Separation of Workloads.

Having your SQL Server run on the same machine as your ABE Server can cause a lack of Performance for both workloads.

Having ABE run on you Domain Controller exposes your Domain Controller Role to the risk of being starved of CPU Cycles and thus not facilitating Domain Logons anymore.

You need to test and monitor your performance

In many cases you are deploying a new file service concept into an existing environment.

Thus you can get some numbers regarding QUERY_DIRECTORY requests, from the existing DFS / Fileservers.

Build up your Namespace / Shares as you envisioned and use the File Server Capacity Tool (https://msdn.microsoft.com/en-us/library/windows/hardware/dn567658(v=vs.85).aspx ) to simulate the expected load against it.

Monitor the SMB Service Response Times, the Processor utilization and Queue length and the feel on the client while browsing through the structures.

This should give you an idea on how many servers you will need, and if it is required to go for a slimmer design of the data structures.

Keep monitoring those values through the lifecycle of your file server deployment in order to scale up in time.

Any deployment of new software, clients or the normal increase in data structure size could throw off your initial calculations and test results.

This point should imho be outlined very clearly in any concept documentation.

This concludes the first part of this Blog Series.

I hope you found it worthwhile and got an understanding how to successfully design a File service with ABE.

Now to round off your knowledge, or if you need to troubleshoot a Performance Issue on an ABE-enabled Server, I strongly encourage you to read the second part of this Blog Series. This post will be updated as soon as it’s live.

With best regards,

Hubert