Several weeks ago one of our joint customers with SAP complained about the capabilities of SQL Server and announced that he would migrate his SAP landscape to a competing database platform if Microsoft didn’t fix “the problem”. The key issues for which the database was blamed: unacceptable performance and frequent downtime of the SAP application, causing significant friction between the business and the IT organization.
Well, somebody who’s a little bit trained on isolating the potential root cause for such a scenario might ask some questions, e.g. following this generic approach:
Unfortunately, nobody was available to answer even basic technical questions: The customer had outsourced his infrastructure operations to a third-party hosting provider.
Here’s what happened: Three years ago the customer had migrated his SAP landscape from Oracle/UNIX, originally running on expensive proprietary hardware, to SQL Server/Windows Server, now deployed on commodity hardware. This move yielded significant cost savings. The IT organization was then slashed, and the majority of operational tasks were outsourced – yielding further cost savings. Unfortunately that second step had two consequences which were not immediately visible: The DBA (Database Administrator) role was made redundant. And, since no infrastructure or operations competence remained in the customer’s IT organization, nobody was left to notice the absence of the DBA role in the Service Level Agreement (SLA) with the hosting provider. In the meantime several additional SAP applications were installed; since the initial re-platforming of the SAP system the size of the database has more than doubled, and the number of concurrent users has nearly tripled. The hosting provider met its obligations to the letter, as defined in the SLA – and nobody ever touched the database, except for back-ups.
Isn’t that a great confirmation of SQL Server’s product quality? An engine, running for nearly three years, supporting thousands of users 24×7 – without anybody being in the engine room?
The cheapest solution is never the most cost-effective solution: The opportunity cost resulting from slow application performance, plus the cost of unplanned downtime, regularly outweighs the savings from not employing a DBA. Running a business-critical application without any maintenance may be possible, but it’s definitely not a recommended practice. Although SQL Server is easier to operate and maintain than competing databases, it still requires somebody to look after this important engine from time to time. And that is exactly the role of the DBA.
If you accept this line of thinking, then you’ll certainly scratch your head and ask yourself: Where would I find a good DBA, and what exactly are the accountabilities of that role?
Here’s another DBMS vendor’s comprehensive overview of the DBA’s role. Microsoft’s understanding of the role goes beyond this basic description, and the importance of the role is demonstrated by the Microsoft SQL Server certifications: An MCITP has at least demonstrated that he has the basic knowledge for operating a business-critical system. Would that mean that he could actually operate such a system in the most effective way? No, not really. Additional guidance, some application-specific training, plus lots of experience will be necessary. Nothing can compensate for experience, but the former two issues can be addressed by two means:
- Brad McGehee has a nice check-list and best practices for the SQL Server DBA, and regularly blogs about this topic. And this is just one example: PASS – The Professional Association for SQL Server, has much more guidance to offer.
- Microsoft offers training specifically for enabling Oracle DBAs – Practical SQL Server 2008 for the Oracle DBA – in addition to the normal SQL Server readiness courses.
But there is another question: If infrastructure operations are outsourced, how could an organization then be sure that the third party’s infrastructure staff meets the skill and capacity needs that are required when operating a business-critical application, such as SAP? How could this requirement be embedded into an SLA, in such a way that defined business availability, measurable performance characteristics, and minimum operational cost are maintained?
One alternative would be to be much more precise when defining the SLA. Unfortunately we regularly notice that most customers don’t seem to be mature enough to define SLAs precisely, and to include all necessary KPIs. A brief test for your own organization:
- What are your three most mission-critical business processes?
- Which SAP transactions are bound to these three processes, and which response times do you guarantee for each of these transactions?
- Which interfaces are connected to the three processes, and how have you defined the SLAs for these interfaces?
- Which batch jobs are related to these three business processes, and how do you guarantee that the maximum duration for these batch jobs is not exceeded? And, btw, what happens if one of these batch jobs is aborted because your batch window is closed?
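To make the batch-job question concrete, here is a minimal sketch of how agreed maximum durations could be written down and checked against measurements. The class `BatchJobKpi`, the function `find_breaches`, the job names and all the numbers are hypothetical illustrations, not part of any SAP or SQL Server API; a real SLA would pull the measured durations from whatever monitoring is actually in place.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BatchJobKpi:
    """One SLA entry: a batch job and its agreed maximum duration (hypothetical)."""
    name: str
    max_duration_min: float


def find_breaches(kpis: List[BatchJobKpi], measured_min: Dict[str, float]) -> List[str]:
    """Return the names of jobs that ran too long or have no measurement at all.

    A job without a measurement is reported too, because "nobody looked"
    should count as a breach of the SLA rather than as a pass.
    """
    breached = []
    for kpi in kpis:
        duration = measured_min.get(kpi.name)
        if duration is None or duration > kpi.max_duration_min:
            breached.append(kpi.name)
    return breached


if __name__ == "__main__":
    # Made-up jobs and limits, purely for illustration.
    kpis = [
        BatchJobKpi("invoice_run", 120),
        BatchJobKpi("mrp_run", 240),
        BatchJobKpi("payroll", 60),
    ]
    measured = {"invoice_run": 95, "mrp_run": 260}
    print(find_breaches(kpis, measured))
```

The point of the sketch is only that each KPI needs a name, a number and somebody who compares the two regularly.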
The other – recommended – alternative: The DBA is not a member of the infrastructure team; the role is instead integrated into the SAP Basis team. And, additionally, the DBA is empowered to enforce infrastructure modifications, as appropriate. In customer environments where these two conditions are met we see significantly fewer performance issues and unplanned outages. It’s quite obvious that the DBA then also needs to be involved in the KPI definition for the SLA. And, sometimes, it then turns out that the cheap outsourcing solution is not that cost-effective at all, at least not when the business requires the continuation of the same service level. Two more questions everybody in this business should ask himself: What’s the total cost to the business per hour of downtime? And how much are you willing to pay for insurance coverage to mitigate that risk? (Or, in tech lingo: Do you really want 5 nines – and not measured at the infrastructure level, but at the UI level?)
Somebody who underestimates, or even ignores, the role of a DBA in this kind of environment accepts a huge operational risk – and that doesn’t go well with business-critical, or even mission-critical, systems. This risk has nothing to do with the products or technologies involved, but with governance. And it has a lot to do with compliance.
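For readers who want to put numbers behind the “nines” question, the arithmetic is simple enough to sketch. The snippet below assumes a 365-day year and an availability of exactly N nines (99.9% for three, 99.999% for five); the function names and the cost per hour are made-up placeholders that you would replace with your own figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget in minutes for an availability of N nines.

    Three nines means 99.9% availability, i.e. an unavailability of 10^-3.
    """
    unavailability = 10.0 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


def yearly_downtime_cost(nines: int, cost_per_hour: float) -> float:
    """Cost of using up the full downtime budget at a given hourly cost."""
    return allowed_downtime_minutes(nines) / 60.0 * cost_per_hour


if __name__ == "__main__":
    hypothetical_cost_per_hour = 10_000.0  # placeholder, not a real figure
    for nines in (2, 3, 4, 5):
        minutes = allowed_downtime_minutes(nines)
        cost = yearly_downtime_cost(nines, hypothetical_cost_per_hour)
        print(f"{nines} nines: {minutes:8.1f} min/year, {cost:10.2f} per year")
```

Under these assumptions five nines leaves a little over five minutes of downtime per year, while three nines leaves almost nine hours. Whether that gap justifies the extra spend depends entirely on the hourly cost you plug in, and on whether availability is measured at the infrastructure level or at the UI level.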