Performance and Scalability

I have been reviewing the Microsoft Patterns & Practices group's new PAG guide, Improving .NET Application Performance and Scalability, and struggling a bit. Firstly, they talk about understanding the performance and scalability requirements and then testing continuously to meet those requirements, which I totally agree with; however, they are a bit vague about what the requirements should look like and how to test against them. Performance and scalability requirements have been very well specified by the benchmarking community, so there is a lot of good practice there. Additionally, the whole area of stress testing is well understood and documented by companies like Mercury and Quest, so again there is a lot of information available.

However, that is not why I am confused. The issue I have is that a large number of different techniques and technologies are covered by the guide (which is good), but there is no structure as to when you should apply them, or indeed how to think about them holistically. I always have problems with long lists (bad memory!) so I need a structure to work from. Thinking about how I go about performance tuning, I tend to use two top-level buckets: architectural issues and design issues. Architectural issues are all about structure, whilst design issues are all about resource management. Typically the architectural issues should be thought about first in the design cycle, as they are both further-reaching in their performance and scalability impact and more difficult (if not impossible) to refactor away later. It should also be noted that there is a lot less architectural skill and guidance available than in the design space. Looking at these two areas in more detail:

Architecture

This is all about how to partition and structure an application: where functionality lives and how it communicates. It is all about bandwidth, latency, granularity, coupling, state (location), layers, affinity, etc. Typically it surfaces as questions such as "When should I use remoting as opposed to web services?" I have always used a general rule about structure (I call it Platt's first law), which is:

The path length through the thread / process / component / service / etc. should be of the same order as the latency through the communications stack between those threads / components / services, etc.

So threads can have very short path lengths (fine grained) because there is minimal latency in a thread switch, whereas services should be very coarse grained because every call has to go through a huge comms stack.
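A back-of-envelope cost model makes the rule concrete (this sketch is mine, not from the PAG guide, and the latency figures are illustrative assumptions, not measurements): every call pays the boundary latency plus its useful work, so once latency dwarfs the work per call, batching into coarser calls is the only way to win.

```python
def total_cost_us(num_calls: int, latency_us: float, work_per_call_us: float) -> float:
    """Rough cost model: each call pays the boundary latency plus its work."""
    return num_calls * (latency_us + work_per_call_us)

TOTAL_WORK_US = 1000.0  # the same total useful work in every scenario

# Thread switch: ~1 us latency (assumed), so 1000 fine-grained calls are cheap.
chatty_threads = total_cost_us(1000, latency_us=1.0, work_per_call_us=1.0)

# Web service hop: ~10,000 us latency (assumed); 1000 chatty calls are catastrophic...
chatty_service = total_cost_us(1000, latency_us=10_000.0, work_per_call_us=1.0)

# ...but one coarse-grained call carrying all the work is fine.
chunky_service = total_cost_us(1, latency_us=10_000.0, work_per_call_us=TOTAL_WORK_US)

print(f"threads, fine-grained  : {chatty_threads:>12,.0f} us")
print(f"service, fine-grained  : {chatty_service:>12,.0f} us")
print(f"service, coarse-grained: {chunky_service:>12,.0f} us")
```

The fine-grained service design is roughly three orders of magnitude worse than either the fine-grained thread design or the coarse-grained service design, which is the first law in numbers.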

Needless to say there are always exceptions so blind application of any general purpose rule will always be wrong (Platt’s second law!).

Design

Design issues always seem to be about resource management (and the best book covering the general principles is still Jim Gray's, IMHO). This can be split into read resources and update resources (e.g. transactions), which is where Jim's book really shines (it is a bit weak on caching). The sorts of resources that need to be thought about are CPU, threads, locks, transactions, data, modules, applications, networks, etc. Many of these are of course monitored by perfmon, although you always need to be careful of Heisenberg, who also seems to work in the software world. Whilst monitors are very helpful, resource issues can be very difficult to locate when they interact with one another.
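That last point about interacting resources is worth a sketch: adding one resource (threads) can be silently nullified by another (a shared lock). Using Amdahl's law as the model, with the serialised fraction standing for work done while holding the lock (the 10% figure is an assumption for illustration, not from the guide):

```python
def speedup(threads: int, serial_fraction: float) -> float:
    """Amdahl's law: the serialised fraction (e.g. work done under a
    shared lock) caps the benefit of adding more threads."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)

# Assume 10% of each request runs under a single shared lock.
for n in (1, 2, 8, 32, 128):
    print(f"{n:>3} threads -> {speedup(n, 0.10):5.2f}x")

# The curve flattens towards 1/0.10 = 10x no matter how many threads you
# add. CPU counters alone show idle cores; lock counters alone show a busy
# lock; only viewed together do they explain the plateau.
```

This is exactly the "interacting resources" trap: each monitor in isolation looks unremarkable, and the bottleneck only appears when you correlate them.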

Anyway, it strikes me that when thinking about performance and scalability, different types of analysis should be applied at the different stages of architecting and designing an application, and these should look at different elements of scalability.