Thank you server may I have another thousand

Today I spent several hours working on an issue with ranged retrieval, and it occurred to me that many people probably don’t know what it is. Let me explain.

 

In Active Directory (AD) and Active Directory Application Mode (ADAM) some people have attributes that have *many* values in them. Thousands. Sometimes more.

 

Imagine a scenario where you are reading an attribute with all of these values out of the directory. Pulling them all at once on every read is not ideal: the server spends a lot of computational resources serving up the query, and sending the data over the wire takes a lot of network bandwidth and time. So we do what’s called a ranged retrieval. Upon request we serve up a range of values to the LDAP client (in w2k the range is 1000 values at a time; in w2k03/ADAM it is 1500 by default, and that number is configurable), and the client can keep fetching further ranges until it has them all. This is similar to doing a paged search.

 

MS implemented this first, and we also submitted it as a standards draft. You can find a copy of that draft here: https://www.hut.fi/cc/docs/kerberos/draft-kashi-incremental-00.txt

 

From that doc, note the way in which we request ranges:

 Client Request        Server Response 
 member;Range=0-*      member;Range=0-500 
 member;Range=501-*    member;Range=501-1000 
 member;Range=1001-*   member;Range=1001-1307 

 

As you can see, you only need to specify the starting point of the range. You simply request startingpoint-* and you are given back up to MaxValRange values, beginning at that starting point.
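To make the request pattern concrete, here is a minimal sketch of the client-side loop. It is not tied to any real LDAP library: `fetch_range` is a hypothetical hook that sends one attribute request and hands back whatever attribute description and values the server answered with, and `make_fake_server` is an in-memory stand-in I made up so the loop can run on its own. The sketch also assumes that the server marks the final range by answering with `*` as the upper bound, and that indexes are zero-based and inclusive; check both against your own server.

```python
import re
from typing import Callable, List, Tuple

# Matches a returned attribute description such as "member;Range=0-499"
# or "member;Range=1000-*".
RANGE_RE = re.compile(r"^[^;]+;range=(\d+)-(\d+|\*)$", re.IGNORECASE)

# Hypothetical transport hook: takes the requested attribute description and
# returns (attribute description the server answered with, values).
FetchRange = Callable[[str], Tuple[str, List[str]]]


def fetch_all_values(attr: str, fetch_range: FetchRange) -> List[str]:
    """Collect every value of `attr` by asking for start-* until done."""
    values: List[str] = []
    start = 0
    while True:
        returned_name, chunk = fetch_range(f"{attr};Range={start}-*")
        values.extend(chunk)
        match = RANGE_RE.match(returned_name)
        # Stop when the answer is unparseable, empty, or carries "*" as its
        # upper bound (assumed here to be the last-range signal).
        if match is None or not chunk or match.group(2) == "*":
            return values
        # Otherwise continue from the value after the last one returned.
        start = int(match.group(2)) + 1


def make_fake_server(all_values: List[str], max_val_range: int) -> FetchRange:
    """In-memory stand-in for a directory server, for illustration only."""
    def fetch_range(requested: str) -> Tuple[str, List[str]]:
        attr, _, range_part = requested.partition(";")
        low = int(range_part.split("=", 1)[1].split("-", 1)[0])
        chunk = all_values[low:low + max_val_range]
        is_last = low + max_val_range >= len(all_values)
        high = "*" if is_last else str(low + len(chunk) - 1)
        return f"{attr};Range={low}-{high}", chunk
    return fetch_range


# Made-up data: 1308 members served 500 at a time.
members = [f"CN=user{i}" for i in range(1308)]
server = make_fake_server(members, max_val_range=500)
assert fetch_all_values("member", server) == members
```

Notice that the loop never needs to know what MaxValRange is. It reads the upper bound out of each answer and asks for the next value after it, so it keeps working if the server-side limit changes.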

 

In 2k03/ADAM, as I mentioned above, this value is configurable. Looking at my ADAM instance (using dsmgmt) I can see that the default value is:

MaxValRange                     0

 

A value of 0 effectively means “use the default,” which is 1500.

 

People often ask us “can I change that value to something different?” The answer is of course yes, you can. In extreme cases one might be able to measure a perf hit as a result of it.

 

I typically recommend people not change this, but not for performance reasons. Sure, that is a compelling reason, but it’s not my core reason. My core reason is this: if you assume that you can set this value high enough and you don’t use ranged retrieval, that implies that you always have some artificial ceiling that imposes a maximum number of values which you can fetch. Anything beyond that ceiling will be truncated. I’d much rather implement a solution which will work with any data set than one that only works most of the time.
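The trade-off is easy to put into numbers. This is a rough sketch, not a measurement: the function names are mine, the group sizes are made up, and the only figures taken from above are the 1500 default and the rule that a configured 0 means "use the default."

```python
import math

DEFAULT_MAX_VAL_RANGE = 1500  # default cited above for w2k03/ADAM


def effective_max_val_range(configured: int) -> int:
    """A configured value of 0 means "use the default"."""
    return DEFAULT_MAX_VAL_RANGE if configured == 0 else configured


def values_truncated_by_single_read(total_values: int, ceiling: int) -> int:
    """Values a client never sees if it reads once and relies on the ceiling."""
    return max(0, total_values - ceiling)


def ranged_round_trips(total_values: int, configured: int = 0) -> int:
    """Requests a ranged-retrieval loop needs to read every value."""
    per_request = effective_max_val_range(configured)
    return max(1, math.ceil(total_values / per_request))


# Hypothetical case: the limit was raised to 10,000 and the client reads once,
# then the attribute grows to 12,000 values.
print(values_truncated_by_single_read(12_000, ceiling=10_000))  # 2000
print(ranged_round_trips(12_000))  # 8
```

In that made-up case the single read silently misses 2,000 values, while a ranged loop at the default setting costs eight requests and gets all of them.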