Understanding parallelism on SQL Server

Parallelism refers to multiple processors cooperating to execute a single query at the same time.

Parallel execution involves the overhead of synchronizing and monitoring the tasks. So, that’s why parallel plans are considered expensive operations.

The query optimizer determines whether a parallel plan should be used based on the configuration and the query cost.

  • A configuration value, max degree of parallelism determines how many CPUs can be used to execute a query. Also, using query hint can set how many CPUs can be used for that specific query.
  • Cost threshold for parallelism determines the cost that a query must meet before a parallel query plan will even be considered. Query cost is determined based on the amount of data the query optimizer estimates to be read to complete the operation.

In earlier versions of SQL Server, it was common to disable parallel queries on systems that were primarily used for transaction processing. It’s common to see MAXDOP 1 for OLTP environments. I prefer to raise the cost threshold for parallelism so a parallel plan is only considered for higher cost queries.

Let’s see an example.

I executed the query below and I got parallelism for that. My query cost was 374.148, remember this cost is always estimated.

SELECT * FROM dbo.FactInternetSales AS FIS
LEFT JOIN dbo.DimCustomer AS DC ON DC.CustomerKey = FIS.CustomerKey
LEFT JOIN dbo.DimProduct AS DP ON DP.ProductKey = FIS.ProductKey
LEFT JOIN dbo.DimProductSubcategory AS DPS ON DPS.ProductSubcategoryKey = DP.ProductSubcategoryKey
ORDER BY DP.StandardCost
OPTION(RECOMPILE)

parallel

Increasing the cost threshold for parallelism value to 400 you are not going to see any change in the query, because the real cost isn’t 374, that cost is for only 1 thread.

In my case the max degree of parallelism value is 2, so multiplying 374 x 2 = 748

sp_configure

Setting the value to 800 and run the same query again you can see no parallelism and the cost is 691.208, so it fits in the cost value.

parallel2

If your CXPACKET wait counter is high, try to increase the cost threshold for parallelism and not set MAXDOP to 1 because some queries even in a transaction database will have the benefit to run in parallel.

TempDB summary

 

TempDB-Defaults-e1452024871991
The new tempdb tab in SQL server

Tempdb is a special database available as a resource to all users of a SQL Server instance, you use it to hold temporary objects that users, or the database engine, create.

In many respects, tempdb files are identical to the files that make up other SQL Server databases. From the perspective of storage I/O, tempdb uses the same file structure as a user database one or more data files and a log file. The arrangement of data pages within tempdb data files is also based on the same architecture as user databases.
Unlike all other databases, SQL Server recreates the tempdb database each time the SQL Server service starts. This is because tempdb is a temporary store.
There are three primary ways that the organization of tempdb files can affect system performance:

  • Because users and the database engine both use tempdb to hold large temporary objects, it is common for tempdb memory requirements to exceed the capacity of the buffer pool in which case, the data will spool to the I/O subsystem. The performance of the I/O subsystem that holds tempdb data files can therefore significantly impact the performance of the system as a whole. If the performance of tempdb is a bottleneck in your system, you might decide to place tempdb files on very fast storage, such as an array of SSDs.
  • Although it uses the same file structure, tempdb has a usage pattern unlike user databases. By their nature, objects in tempdb are likely to be short-lived, and might be created and dropped in large numbers. Under certain workloads especially those that make heavy use of temporary objects this can lead to heavy contention for special system data pages, which can mean a significant drop in
    performance. One mitigation for this problem is to create multiple data files for tempdb; this is covered in more detail in the next topic.
  • When SQL Server recreates the tempdb database following a restart of the SQL Server service, the size of the tempdb files returns to a preconfigured value. The tempdb data files and log file are configured to autogrow by default, so if subsequent workloads require more space in tempdb than is currently available, SQL Server will request more disk space from the operating system. If the initial
    size of tempdb and the autogrowth increment set on the data files is small, SQL Server might need to request additional disk space for tempdb many times before it reaches a stable size.

Difference between Lock and Latch

Locks everywhere is a good start to understand how SQL Server provides logical consistency. Every operation has a lock and latch.

What does Latch mean? Latch protects memory on Buffer Pool, is a method that provides physical consistency.

SQL Server does operations in memory, that means, it read the page from disk and put that page on buffer pool to work there. If someone tries to update the data, the page is changed in memory and SQL Server writes the change in the transaction log file. (More about logging)

Basically, Latches are physical locks and hold the lock only for the duration of the physical operation, while the Locks are logical and maintain the lock until the transaction finishes. Both types guarantee data consistency.

LATCH

Sending files to AWS S3 using Powershell

Amazon has a Powershell module to manage the the principal services available. I’ve been working with EC2, RDS and S3 and I wrote a tip to fast copy data to S3 and I created the function bellow to help to send files to S3.

I’m using the function bellow to send my backups to S3. It’s configured to send all files in the paths passed by parameter

Import-Module AWSPowerShell

# Author: Douglas Correa

function Send-S3Files {
 <#
 .SYNOPSIS
 Send the files from a local path to AWS S3
 .DESCRIPTION
 The function will copy all files from a specific path to AWS S3
 .EXAMPLE
 Send-S3Files -BucketName 'backups' -Region 'sa-east-1' -AKey '####' -SKey '####' -LocalSource 'c:\temp' , 'd:\backups' 
 #>
 [CmdletBinding()]
 Param (
       [Parameter(Mandatory=$True, ValueFromPipeline=$False, HelpMessage='Name of the bucket in AWS')][string]$BucketName
     , [Parameter(Mandatory=$True, ValueFromPipeline=$False, HelpMessage='Region used in AWS')][string]$Region
     , [Parameter(Mandatory=$True, HelpMessage='AWS access key')][string]$Akey
     , [Parameter(Mandatory=$True, HelpMessage='AWS secret key')][string]$SKey
     , [Parameter(Mandatory=$True, ValueFromPipeline=$False, HelpMessage='Local machine paths')][string[]]$LocalSource
 )

    process {

        Initialize-AWSDefaultConfiguration -AccessKey $AKey -SecretKey $SKey -Region $region

        foreach($source in $sources) {
            Set-Location $source
            $files = Get-ChildItem '*.*' | Select-Object -Property Name #get all files in the folder

            try {
            if(Test-S3Bucket -BucketName $bucket) {
                foreach($file in $files) {
                    if(!(Get-S3Object -BucketName $bucket -Key $file.Name)) {
                        Write-Host "Copying file : $file "
                        Write-S3Object -BucketName $bucket -File $file.Name -Key $file.Name -CannedACLName private -region $region
                    } 
                }
            } Else {
                Write-Host "The bucket $bucket does not exist."
            }
        } catch {
            Write-Host "Error uploading file $file"
            $Error
        }
    }
}

}

 

SQL Server threads architecture

execution-pipeline

SQL Server retrieves threads from Windows. The SQL Server configuration setting max worker threads (set at instance level) determines how many threads will be retrieved. SQL Server has its own internal scheduling system, independent of the scheduling performed by the operating system.

Instead of using Windows threads directly, SQL Server creates a pool of worker threads that are mapped to Windows threads whenever work needs to be performed.
When a SQL Server component needs to execute code, the component creates a task that represents the unit of work to be done.

For example, if you send a batch of Transact-SQL commands to the server, it’s
likely that the batch will be executed within a task.
When a SQL Server component creates a task, it is assigned the next available worker thread that is not in use. If no worker threads are available, SQL Server will try to retrieve another Windows thread, up to the point that the max worker threads configuration limit is reached.

At that point, the new task would need to wait to get a worker thread. THREADPOOL wait type is when there are no available threads in the server’s thread pool, during parallelism multiple worker threads are required. So, this wait type ir related with CXPACKET.

I suggest reducing parallelism not setting MAXDOP 1. Try to increase Cost threshold for parallelism if the setting is too low (default is 5) and analyze the queries with parallelism, see the indexes and statistics to understand if the query optimizer is estimating right.

 

How SQL Server stores data?

SQL Server stores data in heaps or b-tree structures. Heaps are unordered set and balanced trees are ordered by their keys.

Heaps and b-tree use collection of pages within the structure and it’s called allocation units.

  • IN_ROW_DATA -> contains all data.
  • LOB_DATA -> structure for large objects used to stored in xml, varchar(max), nvarchar(max), varbinary(max) data types.
  • ROW_OVERFLOW_DATA -> when the data type is variable like varchar and nvarchar exceed 8,060 bytes size limit per row.

To see this information is used the system view allocation_units.

allocation_units

SELECT OBJECT_NAME(I.object_id)
 , I.name
 , AU.total_pages
 , AU.used_pages
 , AU.data_pages
 , P.rows
 , AU.type_desc
 FROM sys.allocation_units AS AU
 INNER JOIN sys.partitions AS P ON AU.container_id = P.partition_id
 INNER JOIN sys.indexes AS I ON I.index_id = P.index_id AND I.object_id = P.object_id
 WHERE P.object_id = OBJECT_ID('TabelaDeTamanhoVariavel')

 

Column name Data type Description
allocation_unit_id bigint ID of the allocation unit. Is unique within a database.
type tinyint Type of allocation unit:

0 = Dropped

1 = In-row data (all data types, except LOB data types)

2 = Large object (LOB) data (text, ntext, image, xml, large value types, and CLR user-defined types)

3 = Row-overflow data

type_desc nvarchar(60) Description of the allocation unit type:

  • DROPPED
  • IN_ROW_DATA
  • LOB_DATA
  • ROW_OVERFLOW_DATA
container_id bigint ID of the storage container associated with the allocation unit.

If type = 1 or 3, container_id = sys.partitions.hobt_id.

If type is 2, then container_id = sys.partitions.partition_id.

0 = Allocation unit marked for deferred drop

data_space_id int ID of the filegroup in which this allocation unit resides.
total_pages bigint Total number of pages allocated or reserved by this allocation unit.
used_pages bigint Number of total pages actually in use.
data_pages bigint Number of used pages that have:

  • In-row data
  • LOB data
  • Row-overflow data

Value returned excludes internal index pages and allocation-management pages.

 

Wait Statistics

The first post about waits on SQL Server was regarding what SQL Server Wait means, a brief explanation and the concept might be difficult to catch at first. When the task needs to wait for a resource, it is placed on a list until the resource is available.

SQL Server keeps detailed internal records of how long the tasks spend wainting and the types of resources they are waiting. Wait statistics information can be useful resource for troubleshooting performance problems. I like to start with wait stats information to understand the environment I’m working.

To see the Wait stats details, you can query the following system views:

sys.dm_os_waiting_tasks -> Returns information about the wait queue 
of tasks that are waiting on some resource
sys_dm_os_wait_stats -> Returns information about all the waits 
encountered by threads that executed. 
You can use this aggregated view to diagnose performance issues with 
SQL Server and also with specific queries and batches. 
sys.dm_exec_session_wait_stats (Transact-SQL)

I have scripts on Git I gatter on internet to query wait stats in a easy way to read.

Locks everywhere

LOCKSIn this post I’m going to talk about locks on SQL Server. Locks are necessary, they are used in all operations in the database. Don’t get confused about blocking, locking and blocking are totally different.

When we talk about lock, doing something in the database, like an update and select though will cause a type of lock. The select stantement has a lock operation called shared lock. This means you can share reads with someone else and that may not cause blocks.

SQL Server has different kinds of lock modes, such as (S) Shared, (U) Update, (X) Exclusive, (I) Intent (Sch) Schema, Bulk Update and Key-Range.

  • (S) Shared lock is used in read operations.
  • (U) Update to avoid potential deadlock problem.
  • (X) Exclusive prevent access to a resource by concurrent transactions.
  • (I) Intent prevent other transactions from modifying the higher-level resource and improve the efficiency of the Database Engine in detecting lock conflicts at the higher level.
  • (Sch) Schema uses schema modification (Sch-M) locks during a table data definition language (DDL) operation.

The following table shows the compatibility of the most commonly encountered lock modes.

Existing granted mode
Requested mode IS S U IX SIX X
Intent shared (IS) Yes Yes Yes Yes Yes No
Shared (S) Yes Yes Yes No No No
Update (U) Yes Yes No No No No
Intent exclusive (IX) Yes No No Yes No No
Shared with intent exclusive (SIX) Yes No No No No No
Exclusive (X) No No No No No No

 

Simple way to understand transaction isolation levels

I was talking with a friend about how a database works and she asked me to explain transaction isolation levels to her. So let’s try…

My first intent was to create an analogy with something simple, for example, a book. Two people trying to read/write on the same book. They can read at the same time and it won’t cause any conflict. (shared lock). However, if one tries to write while the other one is reading, depending on the level of isolation that is being used, one of them may be blocked.

Imagine the situation where a person A finished writing page 1 of a book, so person B can read that page. The transaction was open when person A started writing and closed when page 1 was finished. This situation is called READ COMMITTED, because the person B can only read the page 1 after the person A finish writing and closes the transaction. This is SQL Server default isolation level.

Another situation is when the person B is reading the page 1 while the person A is writing, it is called READ UNCOMMITTED. The person A didn’t close the transaction and while the person B is reading something may be changed by person A. If the person B tries to read it again, it might be different because the transaction is still opened for person A.

Next case is when the person B starts to read the page first and person A tries to change what the person B is reading. The isolation level REPEATABLE READ will maintain what the person B is reading until the transaction is finished. That means if the person B tries to read again he will see the same data and nobody can change it. In this case person A will still be able to read and add new pages.

The last level is SERIALIZABLE and this scenario the person B is reading and the person A tries to write a new page, but in this case, the person A will only be able to insert/delete/update any page after person B closes his transaction. Both person A and B can read because the shared lock is compatible. It’s similar to the REPEATABLE READ, however the difference here is that you can’t do any changes in the book.

Output

Memory-optimized Logging

In a previous post I talked about transaction log works, and what about using a memory-optimized table?

SQL Server has the feature Memory-Optimized Objects to improve performance. In-memory nonclustered indexes are implemented using a data structure called a Bw-Tree. A Bw-Tree is a lock and latch-free variation of a B-Tree.

In-memory architecture:inmemory

To enable an application to use In-Memory OLTP, you need to complete the following tasks:

  • Create a memory-optimized data filegroup and add a container to the filegroup.
  • Create memory-optimized tables and indexes.
  • Load data into the memory-optimized table and update statistics after loading the data and before creating the compiled stored procedures.
  • Create natively compiled stored procedures to access data in memory-optimized tables. You can also use a traditional, interpreted Transact-SQL to access data in memory-optimized tables.

  • As needed, migrate data from existing tables to memory-optimized tables.

In this new architecture, let’s see what happens in the logging level.

I created a table using the code:

CREATE TABLE InMemoryTable (
 ID INT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED
 , FIRST_NAME VARCHAR(100)
) WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)

As you can see, the first part is equal to a normal table and to create the in-memory table we use MEMORY_OPTIMIZED and the DURABILITY. In this case, I’d like you to pay attention to the DURABILITY configured to SCHEMA_ONLY.

Let’s take a look in the transaction log after create the table.inmemotable

We can see all the sys changes to create the table. Now, inserting a row in this table and see the log again:

INSERT INTO InMemoryTable VALUES('Douglas Correa')
GO

inmemotable

You can see nothing changed, but where’s my data? The data is there in the table but only in memory. As you can imagine, both the logging and saving the data to disk are expensive operations.

That means with DURABILITY schema_only the data won’t be there after a crash or restart the server. Changing that for SCHEMA_AND_DATA and look in the log file we are going to see the log operation when inserting data.

inmemotable2

As you can see, the log operation is LOP_HK, the row is in the log in case of crash SQL Server can redo.

Conclusion

The memory-optimized table is fast and can improve performance especially if don’t need to save the data, but there are limitations and one of limitation I didn’t like was I can’t detach and attach the database recreating the log file.