Delayed Durability in SQL Server 2014 and How You Can Take Advantage of It

There have been a lot of blog posts and articles written about the Delayed Durability feature in SQL Server 2014. Here are a couple of them to get you started.

As a SQL Server DBA, this is one feature you should consider implementing in your environment – with the proper testing, of course – if you have a transaction log bottleneck. Check your system's top waits: if WRITELOG and potentially LOGBUFFER waits dominate (often alongside PAGEIOLATCH_* waits, which point to data file I/O), your transaction log is likely the bottleneck.
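A quick way to look at the top waits is to query the sys.dm_os_wait_stats DMV. This is a minimal sketch – the list of waits filtered out below is a partial, assumed set of common benign waits, and you may want to extend it for your environment:

```sql
-- Sketch: top waits since the last restart (or last stats reset).
-- WRITELOG and LOGBUFFER near the top suggest a log bottleneck.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                        N'XE_TIMER_EVENT', N'LOGMGR_QUEUE',
                        N'CHECKPOINT_QUEUE', N'WAITFOR',
                        N'REQUEST_FOR_DEADLOCK_SEARCH')
ORDER BY wait_time_ms DESC;
```

Keep in mind these numbers are cumulative since the last SQL Server restart, so interpret them against how long the instance has been up.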

I couldn’t find any resources about whether or not SharePoint can take advantage of this feature, or if it is even supported at all. As I’ve written in a previous blog post, SharePoint uses multiple databases, and transactional consistency is maintained by keeping all of those databases in sync (or at least as close to being synchronized as possible). I guess I’ll just have to wait for official support documentation or a KB article.

The one caveat you need to consider with the delayed durability feature in SQL Server 2014 is this: you are trading transactional durability for transaction log performance.

Now, as a SQL Server high availability and disaster recovery guy, I’m all for transactional durability. It’s what we’re all accustomed to, and it guarantees that transactions are durable once committed. In a highly transactional, mission-critical system, every transaction counts, and making them durable is how we meet recovery objectives in terms of data loss. So why do I think your application might benefit from relaxing that guarantee?

Understanding the workload profile of your database can help you decide whether you can take advantage of delayed durability. Also, a lot of the SQL Server deployments I see in the field are now in virtualized environments, utilizing storage area networks (SAN) and even iSCSI. The virtual machines share a common storage platform, which impacts I/O throughput not just for your databases but for all workloads running on the SAN. As a result, your databases will potentially experience I/O latencies – be it on the data file or the transaction log file. While we can apply all of the storage configuration best practices to optimize database I/O performance, there’s only so much we can do with the limited resources we have. Implementing delayed durability can help improve transaction log performance.
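Enabling the feature is a two-level decision: the database-level setting controls whether delayed durability is disabled, allowed, or forced, and with ALLOWED each transaction opts in at commit time. A sketch, using a hypothetical database named SalesDB and table dbo.Orders:

```sql
-- ALLOWED lets individual transactions opt in;
-- FORCED makes every commit on the database delayed durable.
ALTER DATABASE SalesDB SET DELAYED_DURABILITY = ALLOWED;

-- With ALLOWED in place, a transaction opts in at commit time:
BEGIN TRANSACTION;
    UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderID = 42;
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);
```

ALLOWED is the more conservative choice for testing, since only the transactions you explicitly mark give up the durability guarantee.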

But what about potential data loss? Your transactions are not durable when committed. Scary, isn’t it?

This is one of the main reasons why I emphasize the importance of recovery objectives (RPO/RTO) and service level agreements (SLAs). You cannot treat a mission-critical application that cannot afford data loss the same as a line-of-business application that only needs to be online 16 hours a day. It’s the same reason why it’s OK to use the SIMPLE recovery model with some of your databases. If you understand your workload profile and your recovery objectives, you can categorize your databases according to criticality. And if your not-so-mission-critical databases are experiencing a transaction log bottleneck, every bit of performance optimization helps, including delayed durability.
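It’s also worth knowing that the exposure window can be narrowed on demand. If a database runs with delayed durability, any log records still sitting in memory can be hardened to disk explicitly – for example, before a planned failover or shutdown – using the built-in procedure that SQL Server 2014 ships for this purpose:

```sql
-- Flush any delayed durable log records for the current database
-- to disk, making those committed transactions durable.
EXEC sys.sp_flush_log;
```

This doesn’t protect you from an unexpected crash between flushes, but it does let you bound the risk around operations you control.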

If it were my call, I would immediately turn this feature on for Tier 1 and Tier 2 SQL Server 2014 database servers and measure the performance impact. Anything beyond Tier 2, I would carefully evaluate the potential data loss and discuss it with the business stakeholders.
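To measure that impact, you want a before-and-after baseline of transaction log write latency. One sketch of such a baseline, using the sys.dm_io_virtual_file_stats DMV:

```sql
-- Average write latency per transaction log file, cumulative
-- since instance startup. Capture before and after enabling
-- delayed durability and compare under a comparable workload.
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.num_of_writes,
    vfs.io_stall_write_ms,
    CASE WHEN vfs.num_of_writes = 0 THEN 0
         ELSE vfs.io_stall_write_ms / vfs.num_of_writes
    END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON vfs.database_id = mf.database_id
   AND vfs.file_id = mf.file_id
WHERE mf.type_desc = N'LOG'
ORDER BY avg_write_latency_ms DESC;
```

Like the wait stats, these counters are cumulative, so take snapshots over a fixed interval rather than comparing raw totals.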

Are you in the process of upgrading to SQL Server 2014? This is a great opportunity to test if your workloads can take advantage of this feature.
