Useful transactional replication stored procedure

The scenario: your publication has the immediate_sync option set to false. With that setting, when you run the Snapshot Agent, snapshot files are created only for new articles; likewise, when you add a new subscriber to an existing publication, snapshot files are created only for the new subscription.

Setting immediate_sync to false prevents all snapshot files from being generated again when you add a new article to the replication, but the snapshot still needs a schema lock to check all articles for modifications.

If you have a large number of articles and many subscribers, you might run into this situation: you add a new article to your transactional replication and, after you run the snapshot, the article does not propagate to the subscribers.

First, check your publication settings.

exec sp_helppublication 'PublicationName'
GO

If immediate_sync and allow_anonymous are both false, you can run the procedure below to fix a new article that didn’t propagate to your subscribers.

exec sp_refreshsubscriptions 'PublicationName'
GO
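
If the publication does not use these settings yet, they can be changed with sp_changepublication. This is a minimal sketch, assuming the same placeholder publication name as above and that the publication has no anonymous subscriptions; allow_anonymous has to be set to false before immediate_sync can be.

```sql
-- Run at the Publisher, in the publication database.
-- 'PublicationName' is a placeholder, as in the examples above.
EXEC sp_changepublication
    @publication = N'PublicationName',
    @property = N'allow_anonymous',
    @value = N'false';
GO

EXEC sp_changepublication
    @publication = N'PublicationName',
    @property = N'immediate_sync',
    @value = N'false';
GO
```

Running sp_helppublication again afterwards should show both columns as 0.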



How does an update work in transactional replication?

In a post about how updates work, I showed what happens when you run an update with the same values. SQL Server is smart enough to detect that, so it changes nothing and writes minimal log.

The other day I saw a comment recommending that you avoid updating records when none of the values are changing:

Avoid updating records when none of the values are changing. This still does a write and, if the table is in replication or change tracking, still causes the row to be propagated out to other servers. If you are updating a potentially large number of records, make sure to only update the ones where the new value doesn’t equal the old value.

Let’s see how an update behaves when it touches thousands of rows in a table with 2 million rows without changing any value, and whether any rows are propagated to another server.


My lab contains two servers and a demo database replicated from SQL01 to SQL02.

Transactional replication is configured to send the articles, and I’m using a table created with this code:

CREATE TABLE tblMillionsRows (
    id BIGINT NOT NULL IDENTITY PRIMARY KEY,
    largeColumn NVARCHAR(MAX) DEFAULT REPLICATE('TESTE',100),
    smallColumn NVARCHAR(150) DEFAULT 'DEMO',
    dateColumn DATETIME
        DEFAULT GETDATE()
);
GO

Inserting rows:

INSERT INTO dbo.tblMillionsRows (
    largeColumn,
    smallColumn,
    dateColumn
)
DEFAULT VALUES
GO 2000000

It will take a while to insert all the rows. The next step is to add this table to the replication.

DECLARE @publication    AS sysname;
DECLARE @table AS sysname;
DECLARE @filterclause AS nvarchar(500);
DECLARE @filtername AS nvarchar(386);
DECLARE @schemaowner AS sysname;
SET @publication = N'SQL01_demo_tb01'; 
SET @table = N'tblMillionsRows';
SET @schemaowner = N'dbo';

EXEC sp_addarticle 
	@publication = @publication, 
	@article = @table, 
	@source_object = @table,
	@source_owner = @schemaowner, 
	@schema_option = 0x80030F3,
	@vertical_partition = N'true', 
	@type = N'logbased',
	@filter_clause = @filterclause;

EXEC sp_articlecolumn 
	@publication = @publication, 
	@article = @table;

-- @publisher is documented for non-SQL Server Publishers only, so it is omitted here.
EXEC sp_startpublication_snapshot 
	@publication = @publication;

With everything set, let’s run some tests. First, update the table with the value it already has and see whether anything is replicated. (The table has a primary key, so it is not a heap.)

UPDATE tblMillionsRows SET smallColumn = N'DEMO' WHERE id < 10000
GO

After the update, I ran the script below to see the transactions and commands available to replicate, and nothing had changed.

-- Run at the Distributor.
SELECT * FROM distribution.dbo.MSrepl_commands;
SELECT * FROM distribution.dbo.MSrepl_transactions;
EXEC distribution.dbo.sp_browsereplcmds;
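
With millions of rows, reading the raw tables gets noisy. The sketch below counts stored commands per article instead. It assumes the distribution database uses the default name distribution, and it counts every command still stored there, delivered or not.

```sql
-- Run at the Distributor.
-- Counts the commands currently stored in the distribution database, per article.
SELECT
    pd.publisher_db,
    a.article,
    COUNT(*) AS command_count
FROM distribution.dbo.MSrepl_commands AS c
JOIN distribution.dbo.MSpublisher_databases AS pd
    ON pd.id = c.publisher_database_id
JOIN distribution.dbo.MSarticles AS a
    ON a.article_id = c.article_id
   AND a.publisher_db = pd.publisher_db
   AND a.publisher_id = pd.publisher_id
GROUP BY pd.publisher_db, a.article
ORDER BY command_count DESC;
```

Comparing the count before and after an update shows whether that update generated any commands.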

But when I actually changed the value by running the update below, I could see new commands to replicate to the other server.

UPDATE tblMillionsRows SET smallColumn = N'DEMO1' WHERE id < 10000
GO


In conclusion, SQL Server won’t replicate updates that don’t change the value. You can see more about updates in my post.

Be careful with updates: if the table is replicated and you change millions of rows, transactional replication will create one command for each updated row.
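
One way to follow the advice quoted earlier is to filter out rows that already hold the new value and to work in batches. This is a sketch against the demo table above; the batch size of 5000 is an arbitrary assumption, not a recommendation.

```sql
-- Touch only rows whose value really changes.
-- smallColumn is nullable, so NULL rows are included explicitly.
DECLARE @batch INT = 5000;   -- hypothetical batch size
DECLARE @rows  INT = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (@batch) dbo.tblMillionsRows
    SET smallColumn = N'DEMO1'
    WHERE smallColumn <> N'DEMO1'
       OR smallColumn IS NULL;

    -- Stops when no row is left to change.
    SET @rows = @@ROWCOUNT;
END;
GO
```

Batching does not reduce the number of replicated commands (it is still one per changed row), but it keeps each transaction small.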

 

Database migration methods

When thinking about migration, the most common concern is downtime. Some methods require downtime, like backup/restore, while others, like Log Shipping, need less.

You can choose from several different methods to migrate your database. For example, methods that require downtime include:

  • Back up the database, manually copy the backup file to the target machine, and then restore the database there. This method is simple, and using compression minimizes the time required.
  • Perform a backup to an Azure Blob storage account, and then restore the database from that backup. This method removes the need to manually copy the backup file.
  • Detach the database, copy the files to Azure Blob storage, and then attach them to the SQL Server instance. Use this method if you plan to store database files in Azure Blob storage permanently instead of on a hard disk.
  • Log Shipping: uses a backup/restore process, and the downtime depends on copying the last transaction log backup, which is a small file. This method is simple, but it requires a connection between the machines and permission to copy the backup files.
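
The first method can be sketched in T-SQL as follows. The database name demo, all paths, and the logical file names are assumptions for illustration; RESTORE FILELISTONLY returns the real logical names stored in a backup.

```sql
-- On the source server: compressed full backup.
BACKUP DATABASE [demo]
TO DISK = N'D:\Backup\demo_full.bak'          -- hypothetical path
WITH COMPRESSION, CHECKSUM, INIT;
GO

-- On the target server, after copying the file:
-- list the logical file names stored in the backup.
RESTORE FILELISTONLY
FROM DISK = N'D:\Backup\demo_full.bak';
GO

-- Restore, placing the files where the target server keeps data and log.
RESTORE DATABASE [demo]
FROM DISK = N'D:\Backup\demo_full.bak'
WITH MOVE N'demo'     TO N'E:\Data\demo.mdf',      -- hypothetical logical names and paths
     MOVE N'demo_log' TO N'F:\Log\demo_log.ldf',
     RECOVERY;
GO
```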

Methods that avoid or minimize downtime:

  • AlwaysOn: add a new secondary replica and, after replication has completed, fail over to make that machine the primary replica. For an Azure virtual machine, use the Add Azure Replica Wizard.
  • Transactional Replication: use it when you need to minimize downtime but don’t have an AlwaysOn deployment in your SQL Server system.

If you need help to decide which method is better for your scenario, feel free to contact me.

SqlPackage, a tool to import/export SQL Server and Azure SQL DB

SqlPackage is a command-line utility that you can use for export and import operations on both on-premises SQL Server databases and cloud databases. SqlPackage supports the following operations:

  • Extract. Creates a database snapshot DACPAC file from a SQL Server database or from Azure SQL Database.
  • Publish. Updates the schema in a live database to match the schema in a DACPAC
    file. If the database does not exist on the destination server, the publish operation
    creates it.
  • Export. Exports both schema and data from a SQL Server database or from Azure SQL Database into a BACPAC file.
  • Import. Imports the schema and data from a BACPAC into a new database.
  • DeployReport. Creates an XML report that describes the changes that would be made by a publish operation.
  • DriftReport. Creates an XML report of the changes that have been made to a registered database.
  • Script. Creates a Transact-SQL script that you can use to update the schema of a target database to match the schema of a source database.

Use the /Action: or /a parameter to specify which action to execute.

SqlPackage example to import from a BACPAC file to Azure:
sqlpackage.exe /Action:Import /tsn:tcp:.database.windows.net,1433 /tdn: /tu: /tp: /sf: /p:DatabaseEdition=Premium /p:DatabaseServiceObjective=P4 /p:Storage=File

More about SqlPackage parameters:
https://msdn.microsoft.com/en-us/library/hh550080%28v=vs.103%29.aspx?f=255&MSPPError=-2147217396

What is Stretch Database?

Stretch Database is a feature of SQL Server where data can be split between on-premises storage and cloud storage. With Stretch Database, cold, historical data is kept in the cloud and active data is kept on-premises for maximum performance.

Stretch Database requires no changes to client applications or existing Transact-SQL queries, so you can implement it seamlessly for existing applications. Stretch Database can reduce on-premises storage requirements both for data and associated backups. Backups of on-premises data are smaller and therefore run quicker than standard backups. Data in the cloud is backed up automatically.

With Stretch Database, cold historic data remains available for users to query, although there might be a small amount of additional latency associated with queries.


Implement a Stretch Database
You can implement Stretch Database entirely within SQL Server Management Studio; you do not need to pre-configure servers or storage within Microsoft Azure.
Implementing Stretch Database involves the following steps:

  1. Start Microsoft SQL Server Management Studio and connect to the instance of SQL Server.
  2. In Object Explorer, expand Databases.
  3. Right-click the database, point to Tasks, point to Stretch, and then click Enable.
  4. Complete the steps in the Enable Database for Stretch wizard to create a Database Master Key, identify the appropriate tables, and configure the Microsoft Azure deployment.

After implementing Stretch Database, you can monitor it from SQL Server Management Studio.

In Object Explorer, expand Databases, right-click the stretch-enabled database, point to Tasks, point to Stretch, and then click Monitor to open the Stretch Database Monitor. This monitor shows information about both the local and Azure SQL instances, along with data migration status.
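
The same information can also be checked from a query window. This is a rough sketch, assuming a SQL Server version that includes Stretch Database; it shows the instance-level option that Stretch depends on and the migration status view.

```sql
-- Instance-level switch that must be on before a database can be stretched.
EXEC sp_configure 'remote data archive', 1;
RECONFIGURE;
GO

-- Run in the stretch-enabled database: one row per migrated batch.
SELECT table_id,
       migrated_rows,
       start_time_utc,
       end_time_utc,
       error_number
FROM sys.dm_db_rda_migration_status
ORDER BY start_time_utc DESC;
GO
```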

Moving the msdb, model, and tempdb database files

All system databases, except the resource database, can be moved to new locations to help balance I/O load.

To move the msdb, model, and tempdb databases, perform the following steps:

  • For each file to be moved, execute the ALTER DATABASE … MODIFY FILE statement.
  • Stop the instance of SQL Server.
  • Move the files to the new location (this step is not necessary for tempdb, as its files are recreated automatically on startup).
  • Restart the instance of SQL Server.
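
For tempdb, the first step can look like the sketch below. The T:\TempDB folder is a hypothetical target; tempdev and templog are the default logical file names, so check yours in sys.master_files before running it.

```sql
-- Point tempdb's files at the new location (hypothetical folder).
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = N'T:\TempDB\tempdb.mdf');

ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = N'T:\TempDB\templog.ldf');
GO

-- Check the paths recorded in the catalog; they take effect at the next restart.
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_name,
       physical_name
FROM sys.master_files
WHERE database_id IN (DB_ID(N'tempdb'), DB_ID(N'model'), DB_ID(N'msdb'));
GO
```

The same statement works for msdb and model, but their files must also be moved by hand while the instance is stopped.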

The process for moving the master database is different from the process for other databases. To move the master database, perform the following steps:

  • Open SQL Server Configuration Manager.
  • In the SQL Server Services node, right-click the instance of SQL Server, click Properties, and then click the Startup Parameters tab.
  • Edit the Startup Parameters values to point to the planned location for the master database data (-d parameter) and log (-l parameter) files.
  • Stop the instance of SQL Server.
  • Move the master.mdf and mastlog.ldf files to the new location.
  • Restart the instance of SQL Server.

 

TempDB summary

 

Figure: the new tempdb tab in SQL Server.

Tempdb is a special database available as a resource to all users of a SQL Server instance. It holds temporary objects that users, or the database engine, create.

In many respects, tempdb files are identical to the files that make up other SQL Server databases. From the perspective of storage I/O, tempdb uses the same file structure as a user database: one or more data files and a log file. The arrangement of data pages within tempdb data files is also based on the same architecture as user databases.

Unlike all other databases, SQL Server recreates the tempdb database each time the SQL Server service starts. This is because tempdb is a temporary store.

There are three primary ways that the organization of tempdb files can affect system performance:

  • Because users and the database engine both use tempdb to hold large temporary objects, it is common for tempdb memory requirements to exceed the capacity of the buffer pool, in which case the data will spool to the I/O subsystem. The performance of the I/O subsystem that holds tempdb data files can therefore significantly impact the performance of the system as a whole. If the performance of tempdb is a bottleneck in your system, you might decide to place tempdb files on very fast storage, such as an array of SSDs.
  • Although it uses the same file structure, tempdb has a usage pattern unlike user databases. By their nature, objects in tempdb are likely to be short-lived, and might be created and dropped in large numbers. Under certain workloads, especially those that make heavy use of temporary objects, this can lead to heavy contention for special system data pages, which can mean a significant drop in performance. One mitigation for this problem is to create multiple data files for tempdb.
  • When SQL Server recreates the tempdb database following a restart of the SQL Server service, the size of the tempdb files returns to a preconfigured value. The tempdb data files and log file are configured to autogrow by default, so if subsequent workloads require more space in tempdb than is currently available, SQL Server will request more disk space from the operating system. If the initial size of tempdb and the autogrowth increment set on the data files are small, SQL Server might need to request additional disk space for tempdb many times before it reaches a stable size.
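
The last two points can be checked and adjusted in T-SQL. In this sketch the sizes, the second file name tempdev2, and the path are illustrative assumptions; tempdev is the default logical name of the first data file.

```sql
-- Configured (startup) size and growth of each tempdb file.
-- size and growth are stored in 8 KB pages unless growth is a percentage.
SELECT name,
       type_desc,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' %'
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
       END AS growth_setting
FROM sys.master_files
WHERE database_id = DB_ID(N'tempdb');
GO

-- Give the first data file a larger initial size and a fixed growth increment.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);

-- Add a second, equally sized data file to spread allocation contention.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = N'T:\TempDB\tempdev2.ndf',   -- hypothetical path
          SIZE = 4096MB,
          FILEGROWTH = 512MB);
GO
```

Keeping the data files the same size helps SQL Server spread allocations evenly across them.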

Difference between Lock and Latch

The post Locks everywhere is a good starting point to understand how SQL Server provides logical consistency. Every operation takes locks and latches.

What does latch mean? A latch protects memory in the buffer pool; it is the mechanism that provides physical consistency.

SQL Server does its work in memory: it reads the page from disk and puts that page in the buffer pool to work on it there. If someone updates the data, the page is changed in memory and SQL Server writes the change to the transaction log file. (More about logging)

Basically, latches are physical locks that are held only for the duration of the physical operation, while locks are logical and are held until the transaction finishes. Both types guarantee data consistency.
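
The difference is also visible in the DMVs: locks are listed per session and transaction, while latches only appear as aggregated waits. A small sketch, to be run in the database you want to inspect:

```sql
-- Locks: logical, owned by sessions, held until the transaction ends.
SELECT request_session_id,
       resource_type,
       request_mode,
       request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID();

-- Latches: physical and short-lived, so only cumulative wait figures are kept.
SELECT TOP (10)
       latch_class,
       waiting_requests_count,
       wait_time_ms
FROM sys.dm_os_latch_stats
ORDER BY wait_time_ms DESC;

-- Latches on buffer pool pages show up as PAGELATCH_* and PAGEIOLATCH_* waits.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'PAGELATCH%'
   OR wait_type LIKE N'PAGEIOLATCH%';
```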


SQL Server threads architecture


SQL Server retrieves threads from Windows. The SQL Server configuration setting max worker threads (set at instance level) determines how many threads will be retrieved. SQL Server has its own internal scheduling system, independent of the scheduling performed by the operating system.

Instead of using Windows threads directly, SQL Server creates a pool of worker threads that are mapped to Windows threads whenever work needs to be performed. When a SQL Server component needs to execute code, the component creates a task that represents the unit of work to be done. For example, if you send a batch of Transact-SQL commands to the server, it’s likely that the batch will be executed within a task.

When a SQL Server component creates a task, it is assigned the next available worker thread that is not in use. If no worker threads are available, SQL Server will try to retrieve another Windows thread, up to the point that the max worker threads configuration limit is reached.

At that point, the new task has to wait for a worker thread. The THREADPOOL wait type occurs when there are no available threads in the server’s thread pool. Parallel queries require multiple worker threads each, so this wait type is often related to CXPACKET.

I suggest reducing parallelism, but not by setting MAXDOP to 1. Try increasing cost threshold for parallelism if the setting is too low (the default is 5), and analyze the queries that run in parallel: look at the indexes and statistics to understand whether the query optimizer is estimating correctly.
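
A sketch of how to check these settings and the related waits follows. The new threshold of 50 is only an illustrative value, not a recommendation; test any change against your own workload.

```sql
-- Both settings are advanced options.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO

-- Current values (config_value and run_value columns).
EXEC sp_configure 'cost threshold for parallelism';
EXEC sp_configure 'max worker threads';
GO

-- Worker limit the instance is actually using.
SELECT max_workers_count
FROM sys.dm_os_sys_info;

-- Cumulative THREADPOOL and CXPACKET waits since the last restart.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'THREADPOOL', N'CXPACKET');
GO

-- Raise the threshold so that only more expensive queries go parallel.
EXEC sp_configure 'cost threshold for parallelism', 50;   -- hypothetical value
RECONFIGURE;
GO
```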

 

Wait Statistics

The first post about waits on SQL Server covered what a SQL Server wait means. It is a brief explanation, and the concept might be difficult to grasp at first. When a task needs to wait for a resource, it is placed on a list until the resource is available.

SQL Server keeps detailed internal records of how long tasks spend waiting and which types of resources they are waiting for. Wait statistics can be a useful resource for troubleshooting performance problems. I like to start with wait stats to understand the environment I’m working in.

To see the Wait stats details, you can query the following system views:

sys.dm_os_waiting_tasks -> Returns information about the wait queue of tasks that are waiting on some resource.
sys.dm_os_wait_stats -> Returns information about all the waits encountered by threads that executed. You can use this aggregated view to diagnose performance issues with SQL Server and also with specific queries and batches.
sys.dm_exec_session_wait_stats -> Returns information about all the waits encountered by threads that executed for each session.
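
As a starting point, the first two views can be queried like this. It is a minimal sketch; the session_id > 50 filter is only a rough way to skip system sessions.

```sql
-- Top waits accumulated since the last restart (or since the stats were cleared).
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,
       wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;

-- Tasks that are waiting right now.
SELECT session_id,
       wait_type,
       wait_duration_ms,
       blocking_session_id,
       resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id > 50;   -- rough filter for user sessions
```

The first result is cumulative, so it is more useful when compared between two points in time.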

I keep scripts on Git, gathered from the internet, that query wait stats in an easy-to-read way.