Data

So what is parallelism?

In very simple terms, if you have a certain workload running on a server, the workload typically would be to pull data out of storage, do some stuff with it and then send result back. Parallelism is simply splitting this workload and have different CPU cores "do the stuff" on this data simultaneously together .

By doing this you in theory reduce the amount of time needed to return the result

This is a good thing, why "in theory"?

To understand this, you need to understand this does come with a cost. Let's break down what happens before you get the result in case you run in parallel and in normal cases.

No Parallelism	Parallelism
you run a query SQL receives the query and compile an execution plan. SQL pulls the data needed SQL decides what computation is required and sends instructions to processor Processor does stuff and returns results SQL returns results back to you	you run a query SQL receives the query and compile an execution plan. SQL pulls the data needed SQL splits the data into chunks SQL decides what computation is required and sends instructions to processors Processors do stuff and returns results SQL rejoins the data together to form one result set SQL returns results back to you

One key factor here is if the extra steps in parallelism for splitting and rejoining the data take more time\effort than the effort saved by splitting the work load then parallelism is not for you.

There are other key factors here such as how you deal with lots of tiny queries that end up using lots of threads of which you have a finite number but we'll focus on the cost for now.

So surely SQL can make the right judgement?

The simple answer is yes, but it needs a hand. The way SQL decides if and how to split the load is determined by two main settings: maximum degree of parallelism (MAXDOP) and cost threshold for parallelism.

Maximum degree of parallelism: tells SQL how many processors it can use at maximum when splitting any workload while cost threshold for parallelism tells SQL at which point it should decide that a given workload is a good candidate for parallelism.

These are server settings which can be applied server wide, they are by default set horribly wrong, unless you decided to convert your coffee coaster into a SQL server.

Cost threshold for parallelism

Cost threshold for parallelism is measured in units of cost. The magic behind it was initially established by a guy at Microsoft who did an estimate cost of time the query run and basically set 1 second to be one unit. He then worked out that anything above 5 should be a good candidate for running the query in parallel. Now lots of time has passed since then, processors have become a lot better and just that measure is no longer "ok" to use. However, SQL server still ships with that unit of 5 as the default. It no longer means 5 seconds but just 5 units.

This is still useful as a guide, however the 5 units is no longer what they used to be resulting in lots of "cheap" queries being determined as good candidates for parallelism. Now remember that if the query is cheap (i.e. needs less processing) you quickly would fall into the trap of the cost of splitting and joining it outweighing the benefit of parallel running.

MAXDOP

Now this might be a bit controversial, but SQL was never designed to run on parallel, it has been designed at a time where a core would cost you a tad under what it would cost you to buy a house, albeit not a very nice house. As such it's ability to deal with multiple processors is be

So how would I notice this?

No simple answers to this but there can be a variety of reasons that may point to an issue with over parallelism:

THREADPOOL waits: This can point to a situation where you have lots of tiny queries running in parallel, what's happening here is each of them is using lots of threads and you simply are running out of available threads and SQL just sits there waiting for an available thread.
SOS_SCHEDULER_YIELD waits: Very similar to the above but here what happened is the threads are locking each other resulting in threads being put on hold. An addition from Erik Darling is you'll also witness lots of CXPACKET waits and this screams to you that you need to revisit parallelism on your box

So where do I start?

Start by changing the default values now. As a finger up in the air figure start with 50 as your cost threshold for parallelism. Different opinions exists and there's no right answer by 50 is a good starting point. Just remember, it's a starting point. You'd need to do a lot more work to find your optimal setting for your environment.

For the MAXDOP the following article describes your optimal settings according to Microsoft: https://support.microsoft.com/en-gb/help/2806535/recommendations-and-guidelines-for-the-max-degree-of-parallelism-confi

Where do I go from here?

Some further readings: