To OR or …?

Yeap, a year later, there we go, posting a quick blog again.

A colleague of mine skyped me about a query that was running slow. Well, slow means that it ran for 90 seconds, and then it timed out, that is.

Here is the little repro of the challenge:


USE [AdventureWorks2012]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[test](
[ProductID] [INT] NOT NULL,
[FirstDescription] [INT] NULL,
[SecondDescription] [INT] NULL,
CONSTRAINT [PK_ProductID_test] PRIMARY KEY CLUSTERED
(
[ProductID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

I know, I’m too lazy NOT to generate the table from SSMS :-).

Moving further, generate some data for the table – I’m regularly using SQL Data Generator. Again, lazy-lazy, I know…

Now, moving on to the actual code:


DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

SET STATISTICS IO, TIME ON

SELECT *
FROM [dbo].[test]
WHERE [FirstDescription] = 299002250 -- your own value here
OR FirstDescription IS NULL
AND SecondDescription = 84830679 -- your own value here;

I made sure I had a cold cache, then I simply selected all columns where the first description column has a particular value, or the first description column is NULL and the second column has a particular value.

Nothing magic here.

And the results from statistics and time being set to ON:


SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 2 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

(2 row(s) affected)
Table 'test'. Scan count 2, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 35 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

And now for the fun part, the execution plan:

execution plan 1

So we’re seeking the same index twice AND also “helping” it with a very expensive sort? Well, a picture worth a thousand words, right?

Any other way to re-write the query? How about:


SELECT *
FROM [dbo].[test]
WHERE [FirstDescription] = 299002250
UNION ALL
SELECT *
FROM [dbo].[test]
WHERE FirstDescription IS NULL
AND SecondDescription = 84830679;

So we replaced the OR with UNION ALL. I will spare you all the details, but I will show you the new execution plan:

execution plan 2

So we got rid of the expensive sort! How cool is that!

As a closing remark, I urge you to understand that this worked in this particular scenario. To clear the fog: if you are trying the same trick with multiple ORs and UNIONs on the same column, the result will favor the OR:


SELECT *
FROM [dbo].[test]
WHERE [FirstDescription] = 299002250
OR FirstDescription = 1684182851
OR FirstDescription = 364563112;

SELECT *
FROM [dbo].[test]
WHERE [FirstDescription] = 299002250
UNION ALL
SELECT *
FROM [dbo].[test]
WHERE FirstDescription = 1684182851
UNION ALL
SELECT *
FROM [dbo].[test]
WHERE FirstDescription = 364563112;

The execution plans, side by side:

execution plan 3

Of course, I barely scratched the surface here. Please feel free to do your own testing and let the rest of us know the results.

Best regards,
io

 

2 thoughts on “To OR or …?

  1. salut!

    pe exec plan apare un index la care nu ii gasesc definitia nicaieri in articol. Ma astept ca e un index pe toate coloanele pt ca altfel ar aparea un key lookup ( din cauza la select *).

    explicatia la distinct sort sa fie ca se returneaza 2 seturi de date cu duplicate din care se elimina duplicatele?
    daca e asa; echivalentul queryului ar fi un query cu union simplu si nu cu union all; caz in care cred ca se genereaza alt exec plan asemanator. pt ca daca este o inregistrare cu

    [FirstDescription] = 299002250 si si cu SecondDescription = 84830679; atunci o sa se returneze inregistrarea de 2 ori; primul select o returneaza doar odata.

    numai bine,
    Andrei

  2. ai dreptate cu indexul, acuma am vazut. testul a fost facut cu valori unice pentru cele 2 coloane. lucru nespecificat in blog. mersi mult de atentionari, o sa fiu ceva mai atent data viitoare – in general scriu noapte pe blog, pe apucate, precum vezi :-).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.