SQL Server chooses parallel plans based on the costing (there are also some other factors that should be met for the plan that it can go parallel). Sometimes serial plan is slightly cheaper than a parallel, so it is assumed to be faster and picked by the optimizer, however, because the costing model is just a model it is not always true (for a number of reasons, enlisted in Paul’s article below) and parallel plan runs much faster.
Sorting is one of the key operations in query processing. SQL Server can achieve sorting by either reading data in an ordered fashion, for example, performing ordered Rowstore index scan or performing an explicit sort. If we want to get sorted data from a Columnstore index, the only option is to perform a sort explicitly with a Sort operator in a query plan, because a Columnstore index has no particular order, at least at the moment of writing this post.
Columnstore indexes were first introduced in SQL Server 2012, and starting from this version, we got a new query execution mode, called Batch Mode. Batch Mode was originally designed for column store indexes as an execution technique optimized to deal with a large number of rows. Operators, running in a traditional Row Mode, process one row at a time, operators in a Batch Mode process one batch of rows at a time, where a batch is a portion of approximately 900 rows.
In 2012 and 2014 a Sort operator used to run in a Row Mode only, however, starting from SQL Server 2016 the Sort for a Batch Mode was also implemented. In this post, we are going to look at some Batch Mode Sort peculiarities.
In this post, I continue exploration of SQL Server vNext and we will look at the nonclustered columnstore index updates.
Columnstore index has some internal structures to support updates. In 2014 it was a Delta Store – to accept new inserted rows (when there will be enough rows in delta store, server compresses it and switchs to Columnstore row groups) and a Deleted Bitmap to handle deleted rows. In 2016 there are more internal structures, Mapping Index for a clustered Columnstore index to maintain secondary nonclustered indexes and a deleted buffer to speed up deletes from a nonclustered Columnstore index.
Updates were always split into insert + delete. But that is now changed, if a row locates in a delta store, now inplace updates are possible. Another change is that it is now possible to have a per row (narrow) plan instead of per index (wide) plan.
Let’s make some experiments.
Not so long time SQL Server vNext was announced and issued as CTP. The most exciting announcement in that CTP was that SQL Server now supports Linux! This is awesome and I consider it to be great news for many people.
I am personally interested in the new features of query processing, and finally I had some time to install the vNext and dig a little bit into it. Currently it is CTP 1.2 available, and I will use this version for my experiments.
While exploring new extended events, I’ve found an interesting event compilation_stage_statistics and one of the columns of this event was trivial_plan_scanning_cs_index_discarded with the following description “Number of trivial plans discarded or could have been discarded which scan columnstore index”. That pushed me to do some investiagations of the topic.
Let’s try to make some experiments.
In this post, we will continue to look at the cardinality estimation changes in SQL Server 2016. This time we will talk about scalar UDF estimation. Scalar UDFs (sUDF) in SQL Server have quite bad performance and I encourage you try to avoid them in general, however, a lot of systems still use them.
Yesterday I came across a question on one of SQL forums, that I may rephrase like:
“Does a query plan compilation depend on how busy is SQL Server”.
Before we go further, I should explicitly mention that we talk about a Compiled plan, not an Executable plan. Plan execution will of course depend on how busy is server, for example, the query may wait for the memory grant to start execution, or execution may be slow because there are no cached pages in the Buffer Pool etc.
However, the question was about a Compiled plan: does the shape of a plan depend on the server load.
From the first glance it should not. But…
Most of the people knows about the so-called “Parameter Sniffing”. This topic was discussed in many aspects in a number of great articles. It is interesting that not only parameters might be “sniffed” during the first execution, but also a runtime constant functions. Let’s look at the example.
Two days ago, on the 24HOP Russia I was talking about the Query Processor internals and joins. Despite I had three hours, I felt the lack of time, and something left behind, because it is a huge topic, if you try to cover it in different aspects in details. With the few next articles, I’ll try to describe some interesting parts of my talk in more details. I will start with Hash Join execution internals.
This post is for the absolutely geeky people. For those that will not afraid when they hear: “query transformation rules”.
Sometimes people use nolock hint as a “turbo” button for their queries, assuming that not taking locks will speed up the query execution. There are many good articles describing all the dangerous moments of this approach, because of the read uncommitted isolation level. However, the focus of this article is a performance problem that you may encounter using nolock hint in some cases.