Method to update LOB data with minimal logging

Use the .WRITE (expression, @Offset, @Length) clause to perform a partial or full update of varchar(max), nvarchar(max), and varbinary(max) data types.

For example, a partial update of a varchar(max) column might delete or modify only the first 200 characters of the column, whereas a full update would delete or modify all the data in the column. .WRITE updates that insert or append new data are minimally logged if the database recovery model is set to bulk-logged or simple.

UPDATE <table_or_view_name>
SET column_name.WRITE (expression, @offset, @length)
FROM <table_source>
WHERE <search_condition>

expression is the value that is copied to column_name. expression must evaluate to, or be implicitly castable to, the column_name type. If expression is set to NULL, @Length is ignored and the value in column_name is truncated at the specified @Offset.

@Offset is the starting point in the value of column_name at which expression is written. @Offset is a zero-based ordinal position, is bigint, and cannot be a negative number. If @Offset is NULL, the update operation appends expression at the end of the existing column_name value and @Length is ignored. If @Offset is greater than the length of the column_name value, the Database Engine returns an error. If @Offset plus @Length exceeds the end of the underlying value in the column, the deletion occurs up to the last character of the value. If @Offset plus LEN(expression) is greater than the underlying declared size, an error is raised.

@Length is the length of the section in the column, starting from @Offset, that is replaced by expression. @Length is bigint and cannot be a negative number. If @Length is NULL, the update operation removes all data from @Offset to the end of the column_name value.

A regular UPDATE statement overwrites the entire string and is fully logged, which is very inefficient when dealing with large value updates.

Note: An update using the .WRITE method will fail if the existing column value is NULL.
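As a minimal sketch (the table dbo.Documents and its body column are hypothetical, not from the original tip), an append and a partial replace look like this:

-- Hypothetical table for illustration; note the NOT NULL default,
-- since .WRITE fails when the existing value is NULL
CREATE TABLE dbo.Documents (
    id   INT IDENTITY PRIMARY KEY,
    body VARCHAR(MAX) NOT NULL DEFAULT ''
);

INSERT INTO dbo.Documents (body) VALUES (REPLICATE('x', 1000));

-- Append: @Offset = NULL appends the expression to the end of the value
UPDATE dbo.Documents
SET body.WRITE('appended text', NULL, NULL)
WHERE id = 1;

-- Partial update: replace 10 characters starting at zero-based offset 200
UPDATE dbo.Documents
SET body.WRITE('0123456789', 200, 10)
WHERE id = 1;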

More information about performance can be found in my tip on MSSQLTips.

 

 

Troubleshooting Transactional Replication – part 1

The common way to replicate data and objects from one server to another is using transactional replication. As you know, SQL Server processes all actions within the database using Transact-SQL statements. Each completed statement is called a transaction. In transactional replication, each committed transaction is replicated to the subscriber as it occurs.

[Diagram: transactional replication]
Sometimes a problem occurs and the process doesn't work as expected. The first step I like to take is to run sp_whoisactive on the servers (Publisher, Distributor, and Subscriber) to see whether any process is blocking the replication.

Next, check the publisher and subscriber agents; if there is any issue, it will show up there, for example a locked login.

To get more information about what's happening, I like to query the system table MSrepl_Transactions in the distribution database. It contains one row for each replicated transaction, and with that information you can run the procedure sp_browsereplcmds, specifying the starting sequence number, for example @xact_seqno_start = N'0x0000014F000034B0001C'.
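A sketch of that workflow, assuming the Distributor uses the default distribution database name (the sequence number is the example value above):

USE distribution;
GO

-- One row per replicated transaction stored in the distribution database
SELECT TOP (100) publisher_database_id, xact_id, xact_seqno, entry_time
FROM dbo.MSrepl_transactions
ORDER BY entry_time DESC;

-- Browse the commands for a specific transaction sequence number
EXEC sp_browsereplcmds
    @xact_seqno_start = N'0x0000014F000034B0001C',
    @xact_seqno_end   = N'0x0000014F000034B0001C';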

The procedure sp_browsereplcmds returns a result set with the commands stored in the distribution database. The output looks like the table below.

[Screenshot: sp_browsereplcmds output]

How does UPDATE work in transactional replication?

In a post about how UPDATE works, I showed what happens when you run an update with the same values. SQL Server is smart enough to detect that, change nothing, and log only minimal information.

The other day I saw a comment recommending that you avoid updating records when none of the values are changing.

Avoid updating records when none of the values are changing. This still does a write and, if the table is in replication or change tracking, still causes the row to be propagated out to other servers. If you are updating a potentially large number of records, make sure to only update the ones where the new value doesn’t equal the old value.

Let’s see how the update behaves when we update a large number of rows without changing any values, and whether any rows are propagated to the other server.

[Screenshot: Object Explorer showing both servers]

My Lab contains two servers and a demo database replicated from SQL01 to SQL02.

Transactional replication is configured to send the articles, and I’m using a table created with the following code:

CREATE TABLE tblMillionsRows (
    id BIGINT NOT NULL IDENTITY PRIMARY KEY,
    largeColumn NVARCHAR(MAX) DEFAULT REPLICATE('TESTE',100),
    smallColumn NVARCHAR(150) DEFAULT 'DEMO',
    dateColumn DATETIME
        DEFAULT GETDATE()
);
GO

Inserting rows:

-- Every column has a default, so DEFAULT VALUES inserts one row per execution;
-- GO 2000000 repeats the batch 2,000,000 times
INSERT INTO dbo.tblMillionsRows
DEFAULT VALUES;
GO 2000000

It will take a while to insert all the rows. The next step is to add this table to the replication.

DECLARE @publication    AS sysname;
DECLARE @table AS sysname;
DECLARE @filterclause AS nvarchar(500);
DECLARE @filtername AS nvarchar(386);
DECLARE @schemaowner AS sysname;
SET @publication = N'SQL01_demo_tb01'; 
SET @table = N'tblMillionsRows';
SET @schemaowner = N'dbo';

EXEC sp_addarticle 
	@publication = @publication, 
	@article = @table, 
	@source_object = @table,
	@source_owner = @schemaowner, 
	@schema_option = 0x80030F3,
	@vertical_partition = N'true', 
	@type = N'logbased',
	@filter_clause = @filterclause;

EXEC sp_articlecolumn 
	@publication = @publication, 
	@article = @table;

EXEC sp_startpublication_snapshot 
	@publication = 'SQL01_demo_tb01', 
	@publisher =  'SQL01'   

With everything set up, let’s run some tests. First, update the table with the same values and see if anything is replicated.

UPDATE tblMillionsRows SET smallColumn = N'DEMO' WHERE id < 10000
GO

After the update, I ran the script below to see the transactions and commands available to replicate and nothing changed.

SELECT * FROM distribution.dbo.MSrepl_Commands 
SELECT * FROM distribution.dbo.MSrepl_Transactions 
EXEC sp_browsereplcmds 

But when I actually changed the values by running the update below, I could see new commands waiting to be replicated to the other server.

UPDATE tblMillionsRows SET smallColumn = N'DEMO1' WHERE id < 10000
GO

[Screenshot: replication commands generated for the changed rows]

In conclusion, SQL Server won’t replicate updates that don’t change the value. You can see more about updates in my post.

Be careful when doing updates, because if the table is replicated and you change millions of rows, transactional replication will create one command for each updated row.

 

Locks everywhere

In this post I’m going to talk about locks in SQL Server. Locks are necessary; they are used in all operations in the database. Don’t confuse locking with blocking: locking and blocking are totally different things.

Doing anything in the database, like an UPDATE or a SELECT, takes some type of lock. A SELECT statement takes a shared lock, which means reads can be shared with other sessions and usually don’t cause blocking.

SQL Server has different kinds of lock modes, such as (S) Shared, (U) Update, (X) Exclusive, (I) Intent, (Sch) Schema, Bulk Update, and Key-Range.

  • (S) Shared locks are used in read operations.
  • (U) Update locks are used to avoid a potential deadlock problem.
  • (X) Exclusive locks prevent access to a resource by concurrent transactions.
  • (I) Intent locks prevent other transactions from modifying the higher-level resource and improve the efficiency of the Database Engine in detecting lock conflicts at the higher level.
  • (Sch) Schema modification (Sch-M) locks are taken during a table data definition language (DDL) operation.

The following table shows the compatibility of the most commonly encountered lock modes.

Requested mode \ Existing granted mode   IS    S     U     IX    SIX   X
Intent shared (IS)                       Yes   Yes   Yes   Yes   Yes   No
Shared (S)                               Yes   Yes   Yes   No    No    No
Update (U)                               Yes   Yes   No    No    No    No
Intent exclusive (IX)                    Yes   No    No    Yes   No    No
Shared with intent exclusive (SIX)       Yes   No    No    No    No    No
Exclusive (X)                            No    No    No    No    No    No
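To see some of these modes in practice, here is a small sketch (dbo.SomeTable and its columns are illustrative) that opens a transaction and inspects the locks held by the current session through sys.dm_tran_locks:

BEGIN TRANSACTION;

-- Any data modification; dbo.SomeTable is a hypothetical table
UPDATE dbo.SomeTable SET col1 = col1 WHERE id = 1;

-- Expect IX locks on the object and page, and an X lock on the modified row or key
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID;

ROLLBACK TRANSACTION;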

 

Simple way to understand transaction isolation levels

I was talking with a friend about how a database works and she asked me to explain transaction isolation levels to her. So let’s try…

My first thought was to create an analogy with something simple, for example a book: two people trying to read and write the same book. They can read at the same time and it won’t cause any conflict (shared locks). However, if one tries to write while the other one is reading, depending on the isolation level being used, one of them may be blocked.

Imagine a situation where person A has finished writing page 1 of a book, so person B can read that page. The transaction was opened when person A started writing and closed when page 1 was finished. This is called READ COMMITTED, because person B can only read page 1 after person A finishes writing and closes the transaction. This is SQL Server’s default isolation level.

Another situation is when person B reads page 1 while person A is still writing; this is called READ UNCOMMITTED. Person A hasn’t closed the transaction, and while person B is reading, something may be changed by person A. If person B reads the page again, it might be different, because the transaction is still open for person A.

The next case is when person B starts reading the page first and person A tries to change what person B is reading. The REPEATABLE READ isolation level preserves what person B is reading until his transaction is finished. That means that if person B reads the page again he will see the same data, and nobody can change it. In this case person A can still read and add new pages.

The last level is SERIALIZABLE. In this scenario person B is reading and person A tries to write a new page, but person A will only be able to insert, delete, or update any page after person B closes his transaction. Both person A and person B can read, because shared locks are compatible. It’s similar to REPEATABLE READ; the difference is that no changes at all can be made to the range person B is reading, not even adding new pages.
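In T-SQL, the reader (person B) chooses the level with SET TRANSACTION ISOLATION LEVEL; a minimal sketch (dbo.Book is a made-up table for the analogy):

-- Session B: decide how much of session A's in-flight work may be seen
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- the SQL Server default
-- Alternatives: READ UNCOMMITTED, REPEATABLE READ, SERIALIZABLE

BEGIN TRANSACTION;
SELECT * FROM dbo.Book WHERE page_number = 1;
COMMIT TRANSACTION;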


Memory-optimized Logging

In a previous post I talked about how the transaction log works; what about when using a memory-optimized table?

SQL Server has the feature Memory-Optimized Objects to improve performance. In-memory nonclustered indexes are implemented using a data structure called a Bw-Tree. A Bw-Tree is a lock and latch-free variation of a B-Tree.

In-Memory OLTP architecture: [diagram]

To enable an application to use In-Memory OLTP, you need to complete the following tasks:

  • Create a memory-optimized data filegroup and add a container to the filegroup (a sketch of these commands follows this list).
  • Create memory-optimized tables and indexes.
  • Load data into the memory-optimized table and update statistics after loading the data and before creating the compiled stored procedures.
  • Create natively compiled stored procedures to access data in memory-optimized tables. You can also use traditional, interpreted Transact-SQL to access data in memory-optimized tables.
  • As needed, migrate data from existing tables to memory-optimized tables.
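For the first item, a sketch of the filegroup and container commands (the database name and file path are illustrative):

-- Add a memory-optimized filegroup to the database
ALTER DATABASE DemoDB
    ADD FILEGROUP DemoDB_mod CONTAINS MEMORY_OPTIMIZED_DATA;

-- Add a container (a directory, not a regular file) to that filegroup
ALTER DATABASE DemoDB
    ADD FILE (NAME = 'DemoDB_mod_container', FILENAME = 'C:\Data\DemoDB_mod')
    TO FILEGROUP DemoDB_mod;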

In this new architecture, let’s see what happens at the logging level.

I created a table using the code:

CREATE TABLE InMemoryTable (
 ID INT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED
 , FIRST_NAME VARCHAR(100)
) WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)

As you can see, the first part is the same as a normal table; to make it an in-memory table we use MEMORY_OPTIMIZED and DURABILITY. In this case, I’d like you to pay attention to the DURABILITY set to SCHEMA_ONLY.

Let’s take a look at the transaction log after creating the table. [Screenshot: log records generated by the CREATE TABLE]
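The screenshots show the result of reading the active log; one way to do that is the undocumented fn_dblog function (covered later in this document), for example:

-- Read the active portion of the current database's transaction log
SELECT [Current LSN], Operation, Context, [Transaction ID], AllocUnitName
FROM fn_dblog(NULL, NULL)
ORDER BY [Current LSN];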

We can see all the system object changes made to create the table. Now let’s insert a row into the table and look at the log again:

INSERT INTO InMemoryTable VALUES('Douglas Correa')
GO

[Screenshot: transaction log, unchanged after the insert]

You can see nothing changed, but where’s my data? The data is there in the table but only in memory. As you can imagine, both the logging and saving the data to disk are expensive operations.

That means that with DURABILITY = SCHEMA_ONLY the data won’t be there after a crash or a server restart. Changing it to SCHEMA_AND_DATA and looking at the log file again, we can see the log operations generated when inserting data.

[Screenshot: transaction log showing LOP_HK operations]

As you can see, the log operation is LOP_HK; the row is written to the log so that SQL Server can redo it in case of a crash.

Conclusion

Memory-optimized tables are fast and can improve performance, especially if you don’t need to persist the data, but there are limitations. One limitation I didn’t like is that I can’t detach and attach the database while recreating the log file.

 

What does the checkpoint do?

Checkpoints flush dirty data pages from the buffer cache of the current database to disk. This minimizes the active portion of the log that must be processed during a full recovery of a database. During a full recovery, the following types of actions are performed:

  • The log records of modifications not flushed to disk before the system stopped are rolled forward.
  • All modifications associated with incomplete transactions, such as transactions for which there is no COMMIT or ROLLBACK log record, are rolled back.

Checkpoints occur in the following situations:

  • A CHECKPOINT statement is explicitly executed. A checkpoint occurs in the current database for the connection.
  • A minimally logged operation is performed in the database; for example, a bulk-copy operation is performed on a database that is using the Bulk-Logged recovery model.
  • Database files have been added or removed by using ALTER DATABASE.
  • An instance of SQL Server is stopped by a SHUTDOWN statement or by stopping the SQL Server (MSSQLSERVER) service. Either action causes a checkpoint in each database in the instance of SQL Server.
  • An instance of SQL Server periodically generates automatic checkpoints in each database to reduce the time that the instance would take to recover the database.
  • A database backup is taken.
  • An activity requiring a database shutdown is performed. For example, AUTO_CLOSE is ON and the last user connection to the database is closed, or a database option change is made that requires a restart of the database.
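For the first situation in the list, the statement can be issued manually; the optional argument requests the number of seconds SQL Server should take to complete the checkpoint:

-- Manual checkpoint in the current database
CHECKPOINT;

-- Manual checkpoint with a requested duration of 10 seconds
CHECKPOINT 10;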

[Diagram: recovery process]

VLF (Virtual Log Files)

[Diagram: transaction log and virtual log files]

To see how many VLFs you have, simply look at the number of rows returned by DBCC LOGINFO.
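For example (sys.dm_db_log_info is the documented alternative on newer versions, SQL Server 2016 SP2 and later):

-- Each row returned represents one VLF in the current database's log
DBCC LOGINFO;

-- Documented alternative: one row per VLF
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());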

The size and number of VLFs you’ll have depend largely on the size of the chunk when it’s added to your transaction log.

There is no general rule for determining the best value for the auto-growth option, as it varies from case to case. Having too many or too few virtual log files causes poor performance.

Having an excessive number of VLFs can negatively impact all transaction log related activities and you may even see degradation in performance when transaction log backups occur.

Most of the time, excessive VLF fragmentation is caused by frequent file growth in small increments. For example, a database that is set to grow its transaction log file by 5 MB at a time is going to end up with a large number of VLFs whenever the log grows.

Growth increment        Number of VLFs created
<= 64 MB                4
> 64 MB and <= 1 GB     8
> 1 GB                  16

There is insufficient system memory in resource pool

While doing crash and recovery tests on my local machine, the SQL Server instance wouldn’t come online. After the third attempt to bring the instance online, thinking the problem was something else, I looked at the SQL Server errorlog file and could see the problem.

Not enough memory; but it wasn’t on my machine, it was in the resource pool. So, what is a resource pool?

A resource pool represents a subset of the physical resources of an instance of the Database Engine, and in my case it had insufficient memory. Let’s look at the errorlog file:

2018-03-14 16:19:58.09 spid56s     [ERROR] Recovery failed with error 0x83000000 on database 18. This error will be mapped to 'HK_E_RESTORE_INSUFFICIENT_MEMORY' (0x8200002e). (sql\ntdbms\hekaton\runtime\src\hkruntime.cpp : 4805 - 'HkRtRestoreDatabase')
2018-03-14 16:19:58.09 spid34s     [INFO] HkCkptCtrlUninitialize(): Database ID: [18]. Cleaning up StorageArray. LastClosedCheckpointEndTs: '158'
2018-03-14 16:19:58.09 Server      Error: 17300, Severity: 16, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
2018-03-14 16:19:58.09 Server      Error: 17312, Severity: 16, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
2018-03-14 16:19:58.09 Server      Error: 28709, Severity: 16, State: 19. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
2018-03-14 16:19:58.09 spid37s     Error: 701, Severity: 17, State: 137.
2018-03-14 16:19:58.09 spid37s     There is insufficient system memory in resource pool 'default' to run this query.
2018-03-14 16:19:58.09 spid39s     Error: 701, Severity: 17, State: 137.
2018-03-14 16:19:58.09 spid39s     There is insufficient system memory in resource pool 'default' to run this query.
2018-03-14 16:19:58.11 spid55s     [ERROR] Recovery failed with error 0x83000000 on database 15. This error will be mapped to 'HK_E_RESTORE_INSUFFICIENT_MEMORY' (0x8200002e). (sql\ntdbms\hekaton\runtime\src\hkruntime.cpp : 4805 - 'HkRtRestoreDatabase')
2018-03-14 16:19:58.11 spid31s     [INFO] HkCkptCtrlUninitialize(): Database ID: [15]. Cleaning up StorageArray. LastClosedCheckpointEndTs: '155'
2018-03-14 16:19:58.20 spid31s     SQL Server shutdown has been initiated
2018-03-14 16:19:58.21 spid31s     Error: 19032, Severity: 10, State: 1. (Params:). The error is printed in terse mode because there was error during formatting. Tracing, ETW, notifications etc are skipped.
2018-03-14 16:19:58.28 spid34s     SQL Server shutdown has been initiated

After the service started, SQL Server was doing the redo and undo process; this means it was reading the log files and creating compensation log records for any uncommitted transactions it found.

SQL Server needs memory in the buffer pool to complete the redo and undo process, and I didn’t remember changing any SQL Server memory configuration.

So my approach was to connect to SQL Server via the command line while the instance was still up and run sp_configure to see how much memory was configured. Only 512 MB was set for max server memory, and that was the problem. (Max server memory controls SQL Server memory allocation: compile memory, all caches (including the buffer pool), query execution memory grants, lock manager memory, and CLR memory.)

In my environment, with 26 databases and my crash recovery tests, 512 MB for the pool wasn’t enough; when I changed the configuration to 4096 MB I could bring the instance online again.
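A sketch of the check and the fix, as they can be run through sqlcmd:

-- 'max server memory (MB)' is an advanced option
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Check the current setting
EXEC sp_configure 'max server memory (MB)';

-- Raise the limit to 4096 MB and apply it
EXEC sp_configure 'max server memory (MB)', 4096;
RECONFIGURE;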

Conclusion

First, read the errorlog file to get more information about what SQL Server is doing. Also, knowing how transaction log operations, log records, checkpoints, and crash recovery work is fundamental.

SQL Server as a process acquires more memory than specified by the max server memory option. Both internal and external components can allocate memory outside of the buffer pool, which consumes additional memory, but the memory allocated to the buffer pool usually still represents the largest portion of memory consumed by SQL Server.

 

SQL Server Undocumented 2

One more undocumented function/command: the fn_dump_dblog function is used to read transaction log records from a log backup file.

This is an example of the undocumented function reading a transaction log backup:

SELECT [Current LSN], [Operation], [Transaction Name], [Transaction ID], SUSER_SNAME ([Transaction SID]) AS DBUser 
FROM fn_dump_dblog (
            NULL, NULL, N'DISK', 1, N'C:\temp\Backup_Logfile.trn',
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)

A good example of when to use it is when I needed to restore up to a specific LSN using the command:

RESTORE LOG <database_name> FROM <backup_device> WITH STOPBEFOREMARK = 'lsn:<lsn_number>';

To read your current (active) log file, the fn_dblog function is used instead.

The operations returned by the function are listed below:

OPERATION               DESCRIPTION
LOP_ABORT_XACT          Indicates that a transaction was aborted and rolled back.
LOP_BEGIN_CKPT          A checkpoint has begun.
LOP_BEGIN_XACT          Indicates the start of a transaction.
LOP_BUF_WRITE           Writing to buffer.
LOP_COMMIT_XACT         Indicates that a transaction has committed.
LOP_COUNT_DELTA         ?
LOP_CREATE_ALLOCCHAIN   New allocation chain.
LOP_CREATE_INDEX        Creating an index.
LOP_DELETE_ROWS         Rows were deleted from a table.
LOP_DELETE_SPLIT        A page split has occurred. Rows have moved physically.
LOP_DELTA_SYSIND        SYSINDEXES table has been modified.
LOP_DROP_INDEX          Dropping an index.
LOP_END_CKPT            Checkpoint has finished.
LOP_EXPUNGE_ROWS        Row physically expunged from a page, now free for new rows.
LOP_FILE_HDR_MODIF      SQL Server has grown a database file.
LOP_FORGET_XACT         Shows that a two-phase commit transaction was rolled back.
LOP_FORMAT_PAGE         Write a header of a newly allocated database page.
LOP_HOBT_DDL            ?
LOP_HOBT_DELTA          ?
LOP_IDENT_NEWVAL        Identity’s new reseed values.
LOP_INSERT_ROWS         Insert a row into a user or system table.
LOP_LOCK_XACT
LOP_MARK_DDL            Data definition language change: the table schema was modified.
LOP_MARK_SAVEPOINT      Designates that an application has issued a 'SAVE TRANSACTION' command.
LOP_MIGRATE_LOCKS
LOP_MODIFY_COLUMNS      Designates that a row was modified as the result of an UPDATE command.
LOP_MODIFY_HEADER       A new data page was created and its header has been initialized.
LOP_MODIFY_ROW          Row modification as a result of an UPDATE command.
LOP_PREP_XACT           Transaction is in a two-phase commit protocol.
LOP_SET_BITS            Designates that the DBMS modified space allocation bits as the result of allocating a new extent.
LOP_SET_FREE_SPACE      Designates that a previously allocated extent has been returned to the free pool.
LOP_SORT_BEGIN          A sort begins with index creation; SORT_END marks the end of the sorting while creating an index.
LOP_SORT_EXTENT         Sorting extents as part of building an index.
LOP_UNDO_DELETE_SPLIT   The page split process has been undone.
LOP_XACT_CKPT           During the checkpoint, open transactions were detected.