Parsian Gulf

It’s time to talk a little about body parsing. One of IAG’s and UAG’s key functions is parsing the body of the files they deliver to connecting clients. The filter parses the files, looks for various HTML and script links within them, and “signs” them by adding the unique string of characters that you have probably seen before (for example, https://www.contoso.com/whalecom45c76f7678a87d876ca9096a6c/whalecom0/index.html). In case you are not familiar with the signing process at all, its purpose is to allow the server to expose multiple internal servers through a single IP and port. Each internal server gets a unique signature, and when a request arrives from the client, the IAG server looks up the signature and knows which internal server to forward the request to. You can think of it like valet tickets, which let the drivers know which car to bring around when you’re done eating.
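To make the valet-ticket idea concrete, here is a minimal Python sketch of the lookup step. It is only an illustration: the table, the internal server name and the rule of treating every leading "whalecom" path segment as signature material are assumptions of mine, not the product's actual data structures or signature format.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: signature -> internal server.
# The real tables and signature format are internal to IAG/UAG.
SIGNATURES = {
    "whalecom45c76f7678a87d876ca9096a6c": "http://intranet.contoso.local",
}

def resolve(public_url):
    """Return (internal_server, internal_path) for a signed public URL."""
    segments = urlsplit(public_url).path.split("/")
    # segments[0] is "" because the path starts with "/".
    signature = segments[1] if len(segments) > 1 else ""
    internal_server = SIGNATURES[signature]  # KeyError for an unknown ticket

    # Assumption: skip every leading "whalecom..." segment; what remains
    # is the path as the internal server knows it.
    i = 1
    while i < len(segments) and segments[i].startswith("whalecom"):
        i += 1
    return internal_server, "/" + "/".join(segments[i:])
```

Calling resolve() with the example URL above would hand back the hypothetical internal server together with /index.html, which is the "car" the ticket points to.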

The IAG server has an engine called “SRA”, which performs this magic by loading each file into a special buffer in memory, and searching through it for various HTML and JavaScript tags. This introduces several challenges that you may have run into along the way.

A common issue with file parsing is that the buffer allocated in memory for it is limited. Normally, IAG is supposed to parse text files like HTML, ASP and JS, and these files are usually quite small, hardly more than a few hundred kilobytes. Sometimes, though, an unusually large file needs to be delivered to the client, for example when a user downloads a large text file from a SharePoint site or a large attachment from an email message in OWA. We’ve also seen cases where software that generates usage reports creates very large HTML files. If the file is larger than the buffer (10 MB by default), the buffer fills up and the server returns an error. In older versions of IAG, before SP2U3 (https://blogs.technet.com/ben/archive/2010/03/08/it-s-that-time-of-the-year-again.aspx), this would result in a generic and unintelligible 500 error, but following Update 3, it sends a clear message to the Web Monitor.

So…what if you need to have larger files go through the server? Well, there are several options:

1. Option one: Increase the buffer size. This is discussed in detail in the Update-4 for IAG SP1 (https://support.microsoft.com/kb/955123), but basically, it involves adding a registry value with a larger buffer allocation. This is the procedure:

a. Using the Registry Editor, navigate to:

HKEY_LOCAL_MACHINE\SOFTWARE\WhaleCom\e-Gap\von\UrlFilter

b. Create a new DWORD value and name it MaxBodyBufferSize

c. Edit the value to the maximum file size you want to support, in bytes. For example, to allow 20 MB files through, enter a value of 20000000 decimal.

d. Close the registry editor

e. Activate the UAG configuration (otherwise, the new settings will revert after a server reboot)

f. Restart IIS (type IISRESET in a CMD window, or reboot the server)

Keep in mind, though, that the buffer lives in the computer’s memory, and if many users are connecting, it could use up a lot of memory even if no large files are being downloaded. Microsoft recommends setting this value as low as you can get away with.
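If you would rather not type the value by hand, steps a through d can be sketched as a small Python helper that builds the text of a .reg file for the same key and value name. This is my own illustration and not part of the KB article; the function name is made up, and you would still need to import the file, activate the configuration and restart IIS as described above.

```python
# Key and value name are taken from the steps above.
REG_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\WhaleCom\e-Gap\von\UrlFilter"

def build_reg_file(max_bytes):
    """Return .reg file text that sets MaxBodyBufferSize to max_bytes."""
    if not 0 < max_bytes <= 0xFFFFFFFF:
        raise ValueError("value must fit in a DWORD (1 to 4294967295)")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + REG_KEY + "]",
        # .reg files store DWORDs as 8 hex digits, not decimal.
        '"MaxBodyBufferSize"=dword:{:08x}'.format(max_bytes),
        "",
    ]
    return "\r\n".join(lines)

if __name__ == "__main__":
    # 20000000 decimal, the 20 MB example from step c.
    print(build_reg_file(20000000))
```

Note that the file carries the value in hexadecimal, so the 20000000 you would type into the Registry Editor as decimal shows up there as 01312d00.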

2. Option two: Skip the parsing of some files. This option has a special GUI element that lets you skip parsing for specific servers and/or specific URLs. This is controlled via the Advanced Trunk Configuration. To configure it, follow these steps:

a. For the relevant trunk, go to Advanced Trunk Configuration.

b. Switch to the “Application Access Portal” tab.

c. Click on EDIT next to “Don’t parse the bodies of these requests”.

d. Click ADD to add a server:

- The server name is the INTERNAL name – the name the IAG server would use to contact the server (and not the public portal URL)

- The server name can be specified using RegEx. For example, Domino.* would affect all servers whose names start with the word Domino, so it can cover an entire farm of servers.

- If the server name contains characters that have a special meaning in RegEx (“non-literals”), they need to be escaped with a backslash. For example, a dot (.) is a non-literal, so the server name www.contoso.com would have to be specified as www\.contoso\.com

e. Click ADD at the bottom of the tab, to add URLs:

- The URL is also internal, so should not include the Whale Signature.

- The URL can also use RegEx, so you could use something like /docs/.* to cause IAG to skip all files read from the docs library of some server.

- The above also means that one needs to be careful when writing the URLs, to make sure they are not missed because they contain non-literals. For a complete reference to RegEx, refer to the IAG Advanced User Guide Appendix B.
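The escaping rules above are easy to get wrong, so here is a quick way to sanity-check a pattern before typing it into the console. This sketch uses Python's re module, which may differ in details from the RegEx dialect IAG uses (see Appendix B for that), and it assumes the pattern has to match the whole server name or URL; the helper name is made up.

```python
import re

def matches(pattern, value):
    """True if the pattern matches the entire value (an assumption here)."""
    return re.fullmatch(pattern, value) is not None

# Domino.* covers a whole farm of servers.
print(matches(r"Domino.*", "Domino01"))                   # True

# An unescaped dot matches ANY character, so this pattern is too loose.
print(matches(r"www.contoso.com", "wwwXcontosoYcom"))     # True

# With the dots escaped, only the literal name matches.
print(matches(r"www\.contoso\.com", "wwwXcontosoYcom"))   # False
print(matches(r"www\.contoso\.com", "www.contoso.com"))   # True

# URL example from the text: everything under /docs/.
print(matches(r"/docs/.*", "/docs/report.html"))          # True

# re.escape adds the backslashes for you.
print(re.escape("www.contoso.com"))
```

The second check is the trap to watch for: a pattern with unescaped dots still "works" for the real server name, so the mistake tends to go unnoticed until it matches something you did not intend.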


3. Option three: Skip parsing based on content type. This option is suitable if you want to apply or block parsing based on the content type of the file. For example, you might feel that all TEXT files should be parsed, but JavaScript should not. By default, the server is configured to parse these content types:

· Text/.*

· Application/x-javascript.*

· Application/x-vermeer-rpc

· Application/x-ica

To change this behavior, go to Advanced Trunk Configuration and switch to the “application customization” tab. Under “search and replace on content-type”, add, remove or edit the default content types to your liking. Please note that a content type is not the same thing as a file extension, so make sure in advance that you know exactly what the content type of the files you want to control is.
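To see why content type and file extension are not the same thing, here is a rough Python sketch of the decision using the default list above. How IAG actually matches these entries is not documented here, so the case-insensitive, anchored-at-the-start matching is my assumption, and the function name is made up.

```python
import re

# The default content types listed above, lowercased.
DEFAULT_PARSED_TYPES = [
    r"text/.*",
    r"application/x-javascript.*",
    r"application/x-vermeer-rpc",
    r"application/x-ica",
]

def should_parse(content_type_header, patterns=DEFAULT_PARSED_TYPES):
    """True if the Content-Type header value matches one of the patterns.

    Assumption: patterns are matched from the start of the header value,
    ignoring case (media type names are case-insensitive in HTTP).
    """
    value = content_type_header.strip()
    return any(re.match(p, value, re.IGNORECASE) for p in patterns)

print(should_parse("text/html; charset=utf-8"))   # True
print(should_parse("Application/x-javascript"))   # True
print(should_parse("application/pdf"))            # False
# A .js file served with this header would NOT match the default list:
print(should_parse("application/javascript"))     # False
```

The last line is the one to remember: two files with the same .js extension can arrive with different Content-Type headers, and only the header is what this option looks at.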


4. Option four: Skip parsing based on the application type. This is suitable if you know that a specific application should be completely skipped for body parsing. This is not a popular option, so I won’t detail it here. Refer to TechNet if you want more details: https://technet.microsoft.com/en-us/library/dd278134.aspx