With enterprise software solutions, whether in the IDA space or elsewhere, an interesting set of things happens as a new solution first hits production. The persons or teams involved feel a certain element of satisfaction at 8 AM on production day 1, and then about 5 minutes later this feeling is quickly squelched by a sense of impending doom – “Oh my gosh, I hope it actually works!”. Experience has shown there are some key issues and requirements in keeping the solution up and running so that it can deliver actual value. For IDA workloads -- ILM, ADFS, RMS and others -- your concerns around the management and operations of the solution will include many of the following:
- High availability – often implemented with failover clustering, load balanced clustering, warm standby servers, and log shipping.
- Load balancing – while also a failover mechanism, load balancing will allow a front end web service to scale out across multiple servers.
- Application patching – as determined or required by the application vendor, this activity may or may not require planned downtime.
- Platform maintenance, service packs, patches – maintenance of the underlying OS platform, such as Windows Server 2008
- Performance and Health monitoring – performance monitoring may include detecting bottlenecks in CPU, memory, disk or network. Health monitoring is aimed at detecting problems early in order to maintain availability.
- Capacity planning – a key challenge (and one that is often overlooked) is the need to forecast future loads on the system and evaluate potential growth plans.
- Service Level Agreements – what is the expected uptime / reliability of the solution, how will outages be handled, what is the notification process for planned downtime or system changes, etc.
- Server Virtualization – virtualization is getting an unbelievable amount of press these days, but suffice it to say, it is highly effective as both a deployment and a manageability tool. Microsoft believes that manageability of the virtualized infrastructure - across Hyper-V, VMware and other components - is the #1 key to realizing the full benefits of this capability. To this end, System Center Virtual Machine Manager provides the management orchestration needed to fully harness the power of virtualization.
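As a small illustration of the capacity planning bullet above, here is a minimal sketch that fits a straight line to past load measurements and extrapolates it forward. The function name and the sample figures are hypothetical, and a simple linear trend is an assumption made for illustration; real workloads may grow in steps or seasonally.

```python
from statistics import mean


def linear_forecast(history, periods_ahead):
    """Fit a least-squares line to evenly spaced samples and extrapolate.

    history: load measurements, one per period (for example, monthly peaks).
    periods_ahead: how many periods past the last sample to project.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(history)
    # Ordinary least-squares slope and intercept.
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n - 1 + periods_ahead)


if __name__ == "__main__":
    # Hypothetical monthly peak object counts processed per sync run.
    monthly_peaks = [100_000, 110_000, 120_000, 130_000]
    print(linear_forecast(monthly_peaks, periods_ahead=6))
```

Comparing the projected figure against the tested limits of the current hardware gives a rough idea of how much headroom remains before a growth plan is needed.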
The above list is fairly long and each bullet could be expanded to 100+ pages, but this is not meant as a scare tactic. In this posting we’ll examine some issues and opportunities for manageability in deploying Identity and Access Management solutions.
In looking at manageability requirements we see varying concerns based on the nature of the solution. Some particulars:
ILM 2007 / FIM 2010 - typical ILM installations today have several tiers involved (FIM 2010 will have a lot more moving parts). By the way, if FIM 2010 is an unknown term to you, we are talking about the next release of ILM – Forefront Identity Manager 2010. Along with the MIIS Sync Engine server and the SQL Server instance, many enterprises are relying on password sync. In terms of availability needs, the metadirectory sync engine can usually tolerate several minutes or even hours of downtime since it’s a state-based engine. However, if your solution requires password synchronization, there is a much higher need for availability (consider it as must-run 24x7). Performance is often a key concern for ILM solutions, as the sync engine often crunches through some very large data sets during its regular run cycle. And of course in certificate management solutions we must manage the ILM parts in coordination with the PKI infrastructure (see below for more about PKI). More info on ILM/FIM here.
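To see why a state-based engine can tolerate downtime, consider this minimal sketch. It is not how the ILM sync engine is implemented; the `sync` function and the dictionaries standing in for a source directory and a target directory are hypothetical. The point is only that each run compares complete state, so changes made while the engine was offline are picked up by the next run.

```python
def sync(source, target):
    """Bring `target` in line with `source` by comparing full state.

    Returns the changes that were applied, grouped by kind.
    """
    changes = {"add": [], "update": [], "delete": []}
    for key, value in source.items():
        if key not in target:
            changes["add"].append(key)
        elif target[key] != value:
            changes["update"].append(key)
        target[key] = value
    # Remove entries that no longer exist in the source.
    for key in list(target):
        if key not in source:
            changes["delete"].append(key)
            del target[key]
    return changes


if __name__ == "__main__":
    hr = {"alice": "Sales", "bob": "IT"}
    directory = {}
    sync(hr, directory)

    # Several changes accumulate while the engine is down...
    hr["alice"] = "Marketing"
    del hr["bob"]
    hr["carol"] = "Finance"

    # ...and a single run after the outage converges the target.
    print(sync(hr, directory))
    print(directory == hr)
```

Password synchronization is different in kind: it has to react to each change as it happens, which is why the availability requirement is so much higher.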
Active Directory Federation Services – ADFS V1 is relatively easy to set up in a high availability design, and given that it’s primarily just a web service, it rarely has performance or throughput problems. Since each ADFS server can only attach to a single AD forest, the number of server nodes in your ADFS infrastructure could become fairly large. Also, each node type – the Federation Server, the ADFS proxy, and the web application servers – will usually need to be in a failover or load balanced cluster. ADFS “Geneva” of course has more moving parts, with SQL Server required and potentially multiple authorization stores. More info on ADFS / ADFS V2 here.
Active Directory Rights Management Services – for RMS your manageability concerns will be similar to those of ADFS, as it is also primarily a web service. RMS also requires SQL Server. RMS will need continuous access to SQL Server for logging but generally does not put a large performance load on SQL. In large enterprises RMS may also require secondary licensing server clusters, and it frequently requires regular updates of desktop-based components, such as the RMS XML templates. More info about RMS is available here.
Public Key Infrastructure (PKI) – lots of moving parts here, from Certificate Authority servers to web servers, the underlying AD and Group Policy, etc. As with many of these complex solutions, the correlation of events will be a key challenge in monitoring and managing the infrastructure.
People, Process and Technology – as a longtime advocate and practitioner of MSF and MOF, I consider this one of those golden triangles. Successfully managing and operating the solution will require roughly equal parts People, Process and Technology. This gets us into the whole area of MOF and ITIL. We strongly recommend that system designers have a working knowledge of operations. Microsoft has some great documents covering MOF 4.0 here. One of the interesting developments in the coming year will be the anticipated release of Microsoft Service Manager. Service Manager will be capable of automating MOF-based processes such as Change Management and Incident Management.
The remainder of this posting really focuses on technology issues (hooray!) but we clearly recognize the importance of “soft skills” in deploying and managing any enterprise IT solution.
Management Tools for Identity & Access Management solutions
Now that we have discussed some of the business and technical requirements, let’s talk about ways to get started with a management tools approach.
It All Starts with the Operating System
In some cases a centralized management tool is not available or doesn’t quite fit the scenario. A good starting point is to consider that Windows Server 2008 is by itself an extremely manageable operating system. Features built in to the OS include WMI, PowerShell scripting, Task Scheduler, Event Forwarding and Collection, and many others. The Event Collection Service built into Windows Server 2008 is an extremely powerful capability which every system administrator should be aware of. Some additional details are provided by Otto Helweg of Microsoft’s WinCAT team in a recent posting. This article points us to a plug-in for Windows Server 2003 and WinXP to support event forwarding/collection.
Microsoft’s approach to manageability incorporates the WS-Management protocol, allowing management information to be transmitted using Web Services protocols. The set of standards for WS-Management is based on earlier work tagged as WBEM – Web Based Enterprise Management. Recently Microsoft has worked in coordination with the Open Pegasus group to enable a more streamlined approach to cross-platform management. SystemCenter Operations Manager 2007 R2 (What’s new in R2) adds operational support for Linux/Unix systems.
In recent years Microsoft has evolved its systems management strategy to focus on an integrated set of products under the SystemCenter branding. In managing the data center we have SystemCenter Operations Manager (SCOM), Virtual Machine Manager (SCVMM) and other great tools. SCOM (formerly known as MOM – MS Operations Manager) has developed into an extremely flexible and capable tool for monitoring distributed systems. SCOM is built around a very elegant architecture known as the model-based database, and the SCOM schema and tools can be easily extended through Management Packs. Microsoft’s Common Engineering Criteria for Windows Server strongly recommends that all server-based applications provide a built-in management pack. Microsoft has built out the industry-wide SystemCenter Alliance to encourage other vendors to supply management packs for their products. In the section below we’ll examine the capabilities of SCOM management packs that can support IDA solutions.
Management packs are XML documents which provide a structure to monitor specific hardware or software. These Management Packs describe a hierarchical health model for the underlying application, and it’s very important to understand the health model of a distributed system before it can be fully managed.
An example health model (for SQL Server 2005/2008) is shown below. We can see that a failure at any point in the chain can result in a loss of functionality for the users. This roll-up capability is provided by Aggregate Monitors. These are objects which monitor the health of all the items in a hierarchy, and if any item goes “red” the entire service will be shown in a Warning or a Failure state.
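The roll-up idea can be sketched in a few lines. This is an illustration only and not the SCOM object model: the `Health` and `Monitor` names are hypothetical, and the sketch assumes a simple "worst child wins" policy, whereas a real aggregate monitor can be configured with other roll-up policies.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Health(IntEnum):
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Monitor:
    name: str
    state: Health = Health.HEALTHY  # only meaningful for leaf monitors
    children: List["Monitor"] = field(default_factory=list)

    def rollup(self) -> Health:
        """Return this node's health: its own state for a leaf,
        otherwise the worst state found among its children."""
        if not self.children:
            return self.state
        return max(child.rollup() for child in self.children)


if __name__ == "__main__":
    disk = Monitor("Data file disk space")
    agent = Monitor("SQL Agent service")
    engine = Monitor("Database Engine", children=[disk, agent])
    service = Monitor("SQL Server", children=[engine])

    disk.state = Health.CRITICAL  # one leaf goes "red"
    print(service.rollup().name)  # the failure surfaces at the top level
```

Walking the same tree from the top down is how an operator gets from a red service icon to the one component that actually failed.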
SCOM ships with a core set of management packs, and also supports the addition of MPs through the SCOM Management Pack Catalog – this is truly a Plug & Play environment and highly extensible. Management packs of interest to IDA designers and administrators include:
- ILM - The ILM MP is here; there is also a separate MP for PCNS. The ILM Management pack is actually labeled as the MIIS 2003 management pack, and the MP does work with ILM’s latest version (ILM 2007 FP1). The pack was originally developed for Ops Mgr 2000 and 2005, so it will not take advantage of the latest features of SCOM 2007. As SCOM allows management packs to be extended, it is possible to implement additional functionality such as recovery tasks. A how-to on extending management packs is described here.
- The FIM 2010 MP is expected to provide monitors for the FIM Service, the FIM Portal, the CLM Portal and the Sync Service. The monitors will support opt-in to control verbosity.
- Active Directory Federation Services – The ADFS management pack is available (here). There are separate management packs for each variant of ADFS – ADFS v1.0 in Windows Server 2003 R2, or ADFS V1.1 in Windows Server 2008 (LINK).
- Rights Management Services – The Microsoft Windows Server 2008 Active Directory Rights Management Services (ADRMS) Management Pack (link) monitors the performance and availability of the Windows Server 2008 version of ADRMS 2.0. By detecting, alerting on, and automatically responding to critical events and performance indicators, this Management Pack helps indicate, correct, and prevent possible RMS related service outages.
- PKI – The Certificate Services Management Pack is found in the MP Catalog; a shortcut is here. This management pack was released for MOM 2005 but can be uplifted to SCOM 2007.
- SQL Server – There are separate management pack components for each version of SQL Server. Each of them relies on the SQL Server Core Library management pack. The management pack for SQL Server provides a very rich set of capabilities.
- Windows Server – you will definitely want to leverage the Windows Server 2003/2008 management pack to provide monitoring of the base OS. Technet provides extensive information on monitoring the base OS as well as many of the Server Roles and related components; a good starting point is here. This core OS management pack provides the means to monitor system resources such as disks, network, CPU and memory.
- Active Directory – as AD is crucial to most IDA scenarios we must ensure strong management and operations. The AD management pack guide provides a wealth of information for system administrators. Using this management pack SCOM will automatically discover and monitor all domain controllers, global catalogs, sites, forests, site links and connection objects. The AD health model is shown in the figure below.
Other related management packs cover Application Server (IIS), Cluster Server, and others. Also, Quest Software has delivered a management pack for Active Directory Application Mode (ADAM), now known as AD LDS.
One of the frequent issues with using monitoring tools is they will generate a lot of alerts, sometimes way too many alerts. The tuning process is an important part of any SCOM implementation and will allow the administrators to set rules as to how server events are interpreted and resolved. It may be necessary to disable certain object monitors or modify threshold levels where alerts are triggered. This topic really proves out the value of an enterprise management tool such as SCOM, since a raw stream of event and performance counters is difficult to manage otherwise.
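One common tuning idea is to alert only when a counter stays above its threshold for several samples in a row, so that a brief spike does not page anyone. The sketch below illustrates that idea in plain Python. It is not SCOM's override mechanism, and the function name, the threshold and the sample values are all hypothetical.

```python
def alert_points(samples, threshold, consecutive=3):
    """Return the sample indices at which an alert would fire.

    An alert fires once per breach episode, at the moment the value has
    been above `threshold` for `consecutive` samples in a row.
    """
    fired = []
    run = 0  # length of the current run of breaching samples
    for index, value in enumerate(samples):
        if value > threshold:
            run += 1
            if run == consecutive:
                fired.append(index)
        else:
            run = 0
    return fired


if __name__ == "__main__":
    cpu = [95, 40, 96, 97, 98, 99, 50, 91]
    # Untuned: every excursion above 90% raises an alert.
    print(alert_points(cpu, threshold=90, consecutive=1))
    # Tuned: only the sustained breach raises an alert.
    print(alert_points(cpu, threshold=90, consecutive=3))
```

Raising the number of consecutive samples trades detection speed for fewer false alarms, which is exactly the kind of judgment call the tuning process exists to make.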
If you’re rolling out a new SCOM infrastructure, or upgrading, be sure to consult with a system management and operations expert. The product is fairly complex to implement and requires some careful planning.
Ensure you have defined a service level agreement (SLA) for your application, and this SLA has been communicated to all stakeholders. Ensure the operational plans for your solution include the range of activities from end users calling the help desk, to server administrators dealing with hardware failures and the like.
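When negotiating an SLA it helps to translate an availability percentage into an actual downtime allowance, and to remember that a solution depending on several tiers in series (for example a web service plus SQL Server) is less available than any single tier. The following is a minimal sketch of that arithmetic. The function names are hypothetical, and the series calculation assumes the tiers fail independently.

```python
def downtime_budget_minutes(availability_pct, period_days=30):
    """Minutes of downtime permitted in the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def serial_availability(tier_availabilities_pct):
    """Combined availability (in percent) of tiers that must all be up,
    assuming independent failures."""
    combined = 1.0
    for pct in tier_availabilities_pct:
        combined *= pct / 100
    return combined * 100


if __name__ == "__main__":
    # A 99.9% target over a 30-day month allows roughly 43 minutes of downtime.
    print(round(downtime_budget_minutes(99.9), 1))
    # Two 99.9% tiers in series give slightly less than 99.9% overall.
    print(round(serial_availability([99.9, 99.9]), 4))
```

Numbers like these make the SLA conversation concrete: stakeholders can see what a given percentage means in minutes per month, and what planned maintenance windows will consume of that allowance.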
If you are interested in keeping current on the topics around manageability, you’ll want to visit the SystemCenter Blog frequently. It will help you track down a plethora of additional information covering the IT industry and Microsoft’s progress in the manageability space.