Determining Which Indexes to Compress Among Thousands

Note: Since I wrote this dbatools had come out with an easier approach that you should check out here.

Our Problem

In a shared in environment you may have hundreds of databases with the same schema but depending on the data loaded into them not all of them will benefit from being compressed so you don’t want to incur the overhead of compression on all the of the data.

Our Solution

We can write procedure check periodically rather a table will benefit from compression or not.  There are a few tricks though:

  1. If you have sparse columns you cannot compress the table, we check for that in lines 70-74.
  2. Next if is a index type of 1 (a clustered index) it cannot contain LOB data types image, ntext, or text.  Which these days shouldn’t be a problem, they have been after all deprecated, so you aren’t using them.  We check for these in lines 76-84.
  3. Finally, we perform the same test for LOB data on index type of 2 (nonclustered indexes) in lines 85-95.

If everything checks out OK we call our second procedure to calculate the percentage of space we would space and if we would save more than 25% then we create script to compression the index.  Now we have a @Debug parameter in here so it can just show us what code it would execute to determine what it will do.

NOTE: This objects are created in a DBA database if you create them in a different database you will need to edit line 99 in the first procedure to indicate which database you placed the procedures in.

Finale

Things I could be see be added is logging when an index is compressed to a table so if something happens to performance you could correlate it back to this event.  You may also not want to hard code the value of 25 percent.  Also in the first procedure you may want to have a way to tell it which databases you want it to consider for operation.  We actually have our procedure where it doesn’t do every database just the once that have identical schemas but different customer data.

USE DBA
GO
CREATE PROCEDURE dbo.IndexMaint_CompressIndexes
(
@Debug bit =
)
AS
SET NOCOUNT ON

DECLARE @DatabaseName NVARCHAR(128),
	@SQL NVARCHAR(MAX)

DECLARE CursorDatabase CURSOR
FOR
SELECT name
FROM sys.databases
WHERE database_id > 4

OPEN CursorDatabase

FETCH NEXT FROM CursorDatabase INTO @DatabaseName

WHILE ( @@fetch_status = 0 )
BEGIN
	SET @SQL = REPLACE(REPLACE(CAST( 
		'
		USE {{DatabaseName}};
	 
		DECLARE  @SchemaName VARCHAR(128),
			@TableName VARCHAR(256),
			@IndexName VARCHAR(256),
			@ColumnName VARCHAR(128),
			@IndexID AS INT,
			@IndexType AS TINYINT,
			@CompressionPercent DECIMAL(4,1),
			@SkipCompressionSparse INT,
			@ObjectID AS INT,
			@TSQLScripCreationIndex NVARCHAR(MAX),
			@LOB BIT;

		DECLARE CursorIndex CURSOR
		FOR
		SELECT DISTINCT
			SCHEMA_NAME(t.schema_id) SchemaName,
			t.name TableName,
			ix.name IndexName,
			ix.index_id,
			t.object_id,
			ix.type
		FROM
			sys.tables t
		INNER JOIN sys.indexes ix
			ON t.object_id = ix.object_id
		LEFT OUTER JOIN sys.partitions p
			ON ix.object_id = p.object_id
				AND ix.index_id = p.index_id
		WHERE
			t.is_ms_shipped = 0
			AND p.data_compression_desc = "NONE"
			AND ix.type IN (1, 2)
			AND t.is_memory_optimized = 0
		ORDER BY
			SCHEMA_NAME(t.schema_id),
			t.name,
			ix.name
			
		OPEN CursorIndex

		FETCH NEXT FROM CursorIndex INTO @SchemaName, @TableName, @IndexName, @IndexID, @ObjectID, @IndexType

		WHILE ( @@fetch_status = 0 )
		BEGIN
			SELECT @SkipCompressionSparse = COUNT(*) 
			FROM sys.columns c
			JOIN sys.objects o ON o.OBJECT_ID = c.OBJECT_ID
			WHERE is_sparse = 1
				AND o.object_id = @ObjectID

			IF @indexType = 1
			BEGIN
				SELECT @LOB = COUNT(*)
				FROM sys.columns COLUMNS
				INNER JOIN sys.types types 
					ON columns.system_type_id = types.user_type_id OR (columns.user_type_id = types.user_type_id AND types.is_assembly_type = 1)  
				WHERE [object_id] = @ObjectID
					AND types.name IN("image", "text", "ntext") ;
			END
			ELSE IF @indexType = 2
			BEGIN
				SELECT @LOB = COUNT(*) 
				FROM sys.index_columns index_columns 
				INNER JOIN sys.columns columns 
					ON index_columns.[object_id] = columns.[object_id] AND index_columns.column_id = columns.column_id 
				INNER JOIN sys.types types  ON columns.system_type_id = types.user_type_id OR (columns.user_type_id = types.user_type_id AND types.is_assembly_type = 1) 
				WHERE index_columns.[object_id] = @ObjectID 
					AND index_columns.index_id = @indexID 
					AND types.name IN("image", "text", "ntext") ;
			END

			IF @SkipCompressionSparse = 0 AND @LOB = 0
			BEGIN
				EXEC DBA.dbo.IndexMaint_CalcCompression
					@DatabaseName = "{{DatabaseName}}",
					@TableName = @TableName ,
					@SchemaName = @SchemaName ,
					@IndexID = @IndexID ,
					@CompressionPercent = @CompressionPercent OUTPUT

				IF @CompressionPercent >= 25
				BEGIN
					SET @TSQLScripCreationIndex = "ALTER INDEX [" + @IndexName + "] ON " + @SchemaName + "." + @TableName + " REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE, ONLINE = ON);"
					EXEC (@TSQLScripCreationIndex)
				END
			END

			FETCH NEXT FROM CursorIndex INTO @SchemaName, @TableName, @IndexName, @IndexID, @ObjectID, @IndexType
		END

		CLOSE CursorIndex
		DEALLOCATE CursorIndex' AS NVARCHAR(MAX)) 
			,'{{DatabaseName}}', @DatabaseName)
			,'"','''') 

	IF @Debug = 0
		EXEC (@SQL);
	ELSE
		SELECT @SQL FOR XML PATH('');
	
	FETCH NEXT FROM CursorDatabase INTO @DatabaseName
END

CLOSE CursorDatabase
DEALLOCATE CursorDatabase
GO
USE DBA
GO
CREATE PROCEDURE dbo.IndexMaint_CalcCompression
(
	@DatabaseName SYSNAME,
	@TableName SYSNAME,
	@SchemaName SYSNAME,
	@IndexID INT,
	@CompressionPercent DECIMAL(4, 1) OUTPUT
)
AS
SET NOCOUNT ON

CREATE TABLE #Compression
(
	ObjectName SYSNAME,
	SchemaName SYSNAME,
	IndexID INT,
	PartitionNumber INT,
	SizeCurrentCompression BIGINT,
	SizeRequestedCompression BIGINT,
	SampleSizeCurrent BIGINT,
	SampleSizeRequested BIGINT
);

DECLARE @SQL NVARCHAR(MAX);

SET @SQL = 'USE ' + @DatabaseName + ';
INSERT INTO #Compression
(
	ObjectName,
	SchemaName,
	IndexID,
	PartitionNumber,
	SizeCurrentCompression,
	SizeRequestedCompression,
	SampleSizeCurrent,
	SampleSizeRequested
)
EXEC sys.sp_estimate_data_compression_savings
	@schema_name = ''' + @SchemaName  + ''',
	@object_name = ''' + @TableName + ''',
	@Partition_number = NULL,
	@index_id = ' + CAST(@IndexID AS NVARCHAR(4)) + ',
	@data_compression = ''PAGE'';'

EXEC (@SQL)

SELECT
	@CompressionPercent = SUM(SizeRequestedCompression) * 1.0 / SUM(SizeCurrentCompression) * 100
FROM #Compression
HAVING
	SUM(SizeCurrentCompression) > 0 ;

DROP TABLE #Compression;
GO

 

Related Posts

6 thoughts on “Determining Which Indexes to Compress Among Thousands

  1. In line 97 the called procedure should be prefixed with the name of the database where the stored procedures resides because inside the dynamic SQL a USE statement is used and the current database changes. EXEC .dbo.IndexMaint_CalcCompression.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.