Distributed Availability Groups Setup and Monitoring

Overview

First, we need to cover what distributed availability groups are.  The basic definition is it an availability group of availability groups (AG). No remembering back to English class in school you were not allowed to use words that were part of the word to define the word so that makes describing this a bit tricky.  But there is no way to do that so you would have an AG on one set of servers that you want to replication to set up as an AG on another set of servers. Now, why would we want to do this?  There are a couple of use cases I can think of.  One is if you want to upgrade versions of Windows or SQL Server and maintain high availability.  Another is if you want to have a separate site set up for disaster recovery that could be the distributed AG.  Another is you need to have a read-only copy of a database from highly available server available on another server that is highly available.

Set up

Setting up an AG can be quite tricky and even with Microsoft support you may run into issues depending on the version of SQL Server in my case I was using SQL Server 2016 SP1 CU4 and 2017 RTM.  All the steps are outlined in Microsoft Docs here. A couple tricks I had to use to get them set up that are not mentioned in the article involved with SQL Server 2016 SP1 CU4.  One was it would not direct seed to the second AG primary server without trace flag 9657 which compresses the direct seeding (on 2017 RTM I actually could not set it up with the trace flag on).  Second on SQL Server 2016 SP1 CU4, on the second AG secondary server, I had to perform an extra step that is not in the documentation which was to join the database to the availability group (note I did not have to do this on SQL Server 2017 RTM).  So if you notice on the second AG’s secondary that the database does not join you need to perform this extra step that is not in the documentation that Microsoft provides.

 

--Wait for it to be only in restoring state, not sychronizing and restoring
USE master
GO
ALTER DATABASE [database] SET HADR AVAILABILITY GROUP = [secondaryag];

 

Monitoring

Monitoring can be tricky with distributed AGs because of the way it shows up in SSMS.  The distributed AG created on the primary AG does not have the ability to show you a dashboard to monitor traffic like the typical AG and the secondary replicas do not even show the distributed AG.  So, to monitor the distributed AG you are best to head to the DMVs.  DMV sys.availability_groups has the is_distributed column that will allow you to only see the distributed AGs.  Below is a query adapted from the Microsoft Docs query to add additional information to check on the status of the distributed AGs which you must run on the primary AG.

 

SELECT ag.name as 'Distributed AG', 
	ar.replica_server_name as 'AG', 
	dbs.name as 'Database', 
	ars.role_desc, 
	drs.synchronization_health_desc, 
	drs.log_send_queue_size, 
	drs.log_send_rate, 
	drs.redo_queue_size, 
	drs.redo_rate,
	drs.suspend_reason_desc,
	drs.last_sent_time,
	drs.last_received_time,
	drs.last_hardened_time,
	drs.last_redone_time,
	drs.last_commit_time,
	drs.secondary_lag_seconds
FROM sys.databases dbs 
    INNER JOIN sys.dm_hadr_database_replica_states drs on dbs.database_id = drs.database_id
	INNER JOIN sys.availability_groups ag ON drs.group_id = ag.group_id
	INNER JOIN sys.dm_hadr_availability_replica_states ars ON ars.replica_id = drs.replica_id
    INNER JOIN sys.availability_replicas ar ON ar.replica_id = ars.replica_id
WHERE ag.is_distributed = 1

 

Summary

While setting up the distributed AG can be challenging it does have its benefits as we outlined above.  Monitoring the distributed AG can be trickier because now you have to monitor the distributed AG on the primary AG then monitor the availability group on the secondary AG.  The DMVs you use to monitor the status of the distributed AG are the same as the ones for the regular AG.

Related Posts

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.