We are thrilled to have special guest authors from the Canadian MVP Award Program contributing posts around their favourite tips, tricks & features of SQL 2014. For the next few weeks, we will be posting a different article from one of our Canadian SQL Server MVPs each week. We hope you enjoy them, please feel free to leave a comment/ Thanks to Michael Defehr for this week's article!
I sometimes wonder if there is a disconnect between what companies are hiring and what they end up getting, or what people are being hired for but then end up doing. The primary job of a DBA is to protect the organization’s data, and to make sure that it is available to those who need access to it while ensuring that it is inaccessible to those who are not authorized to access it.
Protecting the data involves making sure that it is recoverable in the event of an unforeseen disaster. This probably involves performing some form of backup and making sure that said backup is taken often enough and is moved offsite such that if there is a fire, flood, meteor, terrorist attack or somebody accidentally re-provisions the exact wrong SAN LUN, the data can be retrieved, restored and/or otherwise brought back online. Since this is hopefully a very rare occurrence, but can be very complicated depending on the various systems involved, it is an absolutely necessary part of a DBA’s job to practice the bare-metal restore process – whatever that may mean in your organization. It involves coordinating with management on understanding and setting expectations on how long it could take and how much data loss there might be in at least a few different scenarios. It probably involves coordinating with IT guys on what data is where, which backups are moved/copied/transferred where and when. As an aside, I sometimes see some of these details more or less “dictated” by IT groups – “the third party vendor says their product takes care of the database, so please stop asking questions” and I wonder if enough consideration is truly being given to some of the unique challenges of database recovery, and what would actually happen in the event of a true disaster.
Making sure the data is available involves both performance and ensuring that the server continues to function in the event of isolated, predictable failures. Performance is a huge topic and can involve everything from specifying specialized hardware – particularly IO subsystems – that are suitable for database servers to tuning queries, splitting analysis workloads from transaction processing workloads and such. This particular function could easily have an article of its own, and a typical DBA probably spends a larger-than-average proportion of their time on performance. The only other thing I’ll mention here about performance is that database server hardware often sparks another turf war between IT groups and DBA groups. IT groups sometimes have very strict standardized, prescribed ways of provisioning servers which may not work very well for database servers. Depending on the situation, database servers often have to make several exceptions to these rules, but ascertaining when these exceptions are necessary and navigating the process of actually making them can be a challenge. At first glance, handling predictable failures might sound a lot like protecting the data – discussed above, but the difference is in the predictability of the failure – these are things that *probably will* happen, even though with today’s modern, well-built hardware, they too are quite rare. The most classic example is mirroring a hard disk. Besides cooling fans, hard drives are pretty much the only computer components with moving parts which makes them far more likely than any other component to fail – but they also store your data and are the component you least want to fail – that’s why in pretty much any machine that calls itself a “server”, there are redundant hard drives (at least). Handling predictable failures is about minimizing interruptions when common things happen, while protecting the data is about ensuring that you *can* recover no matter what happens. This is my take on the difference between High Availability (HA) and Disaster Recovery (DR) and due to their similarity, they are often discussed together (HA/DR).
Finally, ensuring that the data is inaccessible to those who are unauthorized is all about security. This is where DBAs get their stodgy reputation – sometimes telling you that you can’t access your own data and stuff like that. This function involves classic access control, managing what the application can do vs. what the user can do, thwarting SQL injection attacks, complying with industry security standards, encrypting the data at rest and sometimes even making sure that the server is simply in a locked room.
These are the broad strokes, but does this sound like what you do? Often I find that, other than a fair amount of query tuning (which is a big topic, but was barely mentioned here), the “DBA” is *not* doing this stuff, and I’ve heard the line “well he’s more of a *development* DBA than really a DBA” more than once – which is a bit like saying “well that is more of an apple orange than really an orange…”. My question is often “OK, if he is a database developer, then who is the DBA?” More important than titles and role definitions is: “who is doing this stuff, an do they know what they are doing?”. It is all too easy to overlook some of this stuff, because until or unless something happens, there are no squeaky wheels.
Much of database administration work is a lot like insurance – it is an expense – plain and simple. Only someone who is really good at selling insurance can pull off talking about it in terms of “value” – that is, unless something happens. Basically, you buy insurance either because you have to, or because you are afraid of being ruined if something happens. The fact is, you are probably more likely to have a catastrophic failure in your database server than having your building burn down, but I’ll bet your building is insured. Is your database safe? Resorting to scare tactics is a really detestable way to sell something in my opinion, but companies do fail because their database server breaks – it can be that important. Even if your organization is thinking about doing things like running DR tests, you are often faced with a choice: we really need to run a DR test today, but this report is throwing an error, and the VP of finance really wants it fixed… DBAs are usually pretty smart folks – it would be a shame to “waste” all that talent running tests and drills preparing for something that probably won’t happen, right?
Is it possible to go to the opposite extreme? I have heard of companies that provide their DBAs with satellite phones, are aware of which bridges each one needs to cross in order to get to the data center and stock MRE-type food in the datacenter… This of course begs the question that if an event has taken out multiple bridges in your city, has taken out the communication network and impeded your ability to acquire food in your city, then how important is it to keep your application running? It does make some sense in light of isolated disasters when your application is hosted out to the whole country or the world, and what makes sense for your application will of course vary from what makes sense for others. Rest assured, though, if you do decide to put some effort into some of this stuff, you are unlikely to be on the extreme end of the spectrum.
At the end of the day, you simply need to understand who in your organization is responsible for this stuff. Have you been hired as a DBA? Is some of this stuff in your job description but you rarely end up actually doing a lot of this? Have you hired a DBA and think your data is safe but keep him/her busy fixing reports and supporting development? Food for thought, I hope.
Michael DeFehr is an independent SQL Server consultant doing business throught MDD Solutions Ltd. based in Winnipeg, Manitoba. His career has focused on database administration and development using Microsoft SQL Server. A Professional Engineer and Microsoft Certified Information Technology Professional in both Database Development and Administration, Mike has spent the past 15 years producing and maintaining database solutions. Mike specializes in the relational and storage engines and his solutions emphasize performance and simplicity. Mike is the chapter president of the Winnipeg SQL Server Users Group