Are You Aware Of Who Is Sneaking Behind Your SQL Server Back(ups)?

Pop quiz: You manage a production database and you need a copy of your backups to restore them on a development environment for application testing or performance tuning. You need to do this on a regular basis. Do you

  1. Take a backup of the database and restore it on your development environment?
  2. Configure replication between your production and development environment, using the production database as the publisher and development environment as a subscriber?
  3. Configure database mirroring or log shipping between your production and development environment?
  4. None of the above

While this isn’t intended to emulate a Microsoft certification exam nor a DBA technical interview question, the situation happens on a regular basis, particularly in large organizations. My first exposure to this scenario happened more than a decade ago when I was still working as a data center engineer. We have four (4) environments for every mission-critical application that we host – UAT/development, staging, production and DR.  We do a data refresh every month – we restore a copy of the production databases to the UAT environment for application testing before the code gets promoted to production. Part of the data copy process is to scramble and obfuscate the data for security and compliance purposes. And this was back in the good-old SQL Server 2000 days.

You might be wondering how I would answer the question above.  The current “me” would always answer “it depends.” Which is then followed-up by several other questions like, “what is the current backup strategy like?” or “what is the current DR strategy like?” and other similar ones. The old “me” did none of the above. You might be wondering, “so, how did you even meet the requirement when you did none of the above?” Notice that I only answered none of the above, I didn’t say I did nothing. 🙂

Because we already have a backup and a DR strategy in place, I no longer need to take a backup of the production database just to restore it on the UAT environment. I already have one taken daily, I might as well use that. I’m a big fan of maximizing what I already have to accomplish the tasks that I need. The existing copy of the backup taken as part of the backup strategy is what I use to do the data refresh. In fact, what I did to accomplish this task was I wrote a VBScript to get the latest database backup and restored it on the UAT environment.

Who Is Sneaking Behind Your Back(ups)?

In a previous blog post, I mentioned why having successful backups creates a false sense of security. But here’s one thing that’s even more risky: somebody else taking a backup of your databases without you knowing. You might be wondering, “how can this be risky when you know you have a backup that was even tested by restoring on a UAT environment?” Let me explain.

You create your backup strategies based on your recovery objectives (RPO/RTO) and service level agreements (SLAs.)  If your databases have hit the terabyte-size mark, you might want to create a weekly FULL database backup and a daily DIFFERENTIAL backup with LOG backups happening every 15 minutes. Every backup you take will depend on the latest FULL database backup. Which means that in order to properly restore your database using the backup strategy above, you would need to restore the latest  FULL database backup, the latest DIFFERENTIAL backup and the series of LOG backups that occurred after the latest DIFFERENTIAL backup.

Now, let’s go back to the scenario above. You need a copy of the database restored on the UAT environment. If you take a backup outside of your existing backup strategy, you’ve simply introduced a break in the {log sequence chain} from your point-of-view. What I mean by this is that you still believe that your restore strategy follows the existing backup strategy. If, for example, your backups are stored on the D:\ drive, your first instinct to restore the database would be to look at the D:\ drive, sort the backup files in terms of the Date Modified column in Windows Explorer and use that as a reference point for the restore sequence – without ever thinking that something could have happened in between backups.

A common scenario that I encounter is this: the developers take a backup of the production database, copy/move that backup to the UAT environment and restore the backup. And guess what they’ll do with that backup after they have done the restore? They would simply hit the Shift + DELETE button on that file so as not to take up space on the UAT environment. They may have verified that the backup was successfully restored but they have now deleted the latest FULL database backup that all of the other backups depend on. While there has not been a break in the log sequence chain as far as SQL Server is concerned, your restore sequence will no longer work because of this “missing” backup.

Now, you might argue, “I don’t let anyone – not even developers – take ad-hoc backups from my databases.” To which I would respond, “the only way to find out is to check.” I’ve even had an experience where the developers included a “backup database” feature on the application so that they can bypass the DBAs and take their own backups. Note that anybody who has the sysadmin server role and the db_owner and db_backupoperator database roles can run the BACKUP DATABASE/LOG commands.

Below is a script that I use on a regular basis to check for ad-hoc backups taken from the databases. It is simply reading the backup history tables from the msdb database. However, I pay close attention to the physical_device_name column to see if there is anything unusual in the results.

SELECT database_name,Backup_type=
CASE  TYPE
WHEN
'D' THEN 'FULL'
WHEN 'I' THEN 'DIFFERENTIAL'
WHEN 'L' THEN 'LOG'
WHEN 'F' THEN 'FILE/FILEGROUP'
WHEN 'G' THEN 'DIFFERENTIAL FILE'
WHEN 'P' THEN 'PARTIAL'
WHEN 'Q' THEN 'DIFFERENTIAL PARTIAL'
END
, physical_device_name,backup_start_date, backup_size/1073741824 AS backup_size_GB

FROM msdb.dbo.backupset a INNER JOIN msdb.dbo.backupmediafamily b
ON a.media_set_id = b.media_set_id
WHERE database_name=‘TestDB3’

Here’s a screenshot of the results taken from my test environment. 

backupHistory

 

Notice that the first and third backups use a file name generated by a database maintenance plan (or a script that dynamically generates file names) while the second one uses a simple file name and is on a different location. My guess is that someone took that backup outside of the existing backup strategy. Which means that the restore sequence that requires the DIFFERENTIAL backup now depends on the second FULL database backup instead of the first one.

Call To Action

Include the script above in your daily monitoring and reporting utilities to check if ad-hoc backups are being taken without you knowing. You don’t want to be caught off guard and restore your databases from backups that you thought would “work as expected.” If you see an ad-hoc backup being taken, ask questions. Why was it taken? What is it for? Is it necessary? Who took the backup? Who authorized it? You also need to educate other members of your team – developers, backup engineers, IT admins, etc. – about copy-only backups. I didn’t have this feature back in SQL Server 2000 so I just used a copy of the latest FULL database backup instead. But when it was introduced in SQL Server 2005, I started educating our operations engineers about the COPY_ONLY keyword so they can take copy-only backups.

Please note: I reserve the right to delete comments that are offensive or off-topic.

Leave a Reply

Your email address will not be published. Required fields are marked *