Quantcast
Channel: CodeProject – Essential SQL
Viewing all articles
Browse latest Browse all 162

Using Conditional Logic in SQL with CASE Expression

$
0
0

The main purpose of a CASE expression is to return a value based on one or more conditional tests.  CASE expressions can be used anywhere in a SQL statement an expression is allowed.

Though a truly an expression, some people refer to them as “CASE statements.”  This most likely stems from that use of the phrase in programming languages.

The CASE expression is extremely versatile.  As such it is found throughout SQL.  In particular you find it being used in the SELECT column list, GROUP BY, HAVING, and ORDER BY clauses.

The CASE expression can be used to standardize (beautify) data or perform checks to protect against errors such as divide by zero.

All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database.  You can get started using these free tools using my Guide Getting Started Using SQL Server.

Using Conditional Logic in SQL with the SELECT CASE Statement (Expression)

There are two forms for the CASE clause:  simple and searched.  Both forms are used to return a result based on testing an expression.

The simple CASE expression is used for equality tests.  It tests one expression against multiple values, this makes it great for transforming one set of values, such as abbreviations to their corresponding long form.

The searched CASE expression uses a more comprehensive expression evaluation format.  It is good when you wish to work with ranges of data, such as salary ranges or ages.

We first start out with the simple form, and then cover searched.

CASE expression Simple Form

The simple form of the CASE expression is used to compare the results of an expression with a series of tests and return a “result” when the “test” returns true.

Here is the general form for a CASE expression in Simple Form

CASE expression
   WHEN test THEN result
   …
   ELSE otherResult
END

The ELSE statement is optional in a CASE expression.  When one is not defined, and no matches are made, NULL is returned.  If an ELSE is defined, and no matches are made, then “otherResult” is returned.

The ELSE clause is a great way to catch bad or unexpected data values, and return a result other than NULL.

There are several reasons to use a CASE clause.  The first is to transform data from one set of values to another.  For instance, if want to display an employee’s gender as “Male” or “Female,”  and your data is encoded as “M” or “F,” you can use a CASE expression to test for the single character representation and return its corresponding long form.

The example for this is:

SELECT JobTitle,
       CASE Gender
          WHEN 'M' THEN 'Male'
          WHEN 'F' THEN 'Female'
          ELSE 'Unknown Value'
       END
FROM   HumanResources.Employee

Similarly you can use a simple CASE clause to standardize several values into one.  Extending our example, suppose ‘0’ was used for Male and ‘1’ for female, then we could map all these variations to Male or Female as so:

SELECT JobTitle,
       CASE Gender
          WHEN 'M' THEN 'Male'
          WHEN '0' THEN 'Male'
          WHEN 'F' THEN 'Female'
          WHEN '1' THEN 'Female'
          ELSE 'Unknown Value'
       END
FROM   HumanResources.Employee

You may be wondering if you could just create another table in your database and use that to lookup the values.  I would tend to agree that would be the best, but in many situations you won’t have permission to create tables in the database.  In this case you’re left to your wits and the versatility the SELECT statement provides.

Here are some things to consider when using the simple CASE expression:

  • Only equality comparisons are allowed.
  • Tests are evaluated in the order defined.
  • The result for the first test whose value is TRUE is returned.
  • If no match is made, then a NULL is returned unless ELSE is defined.

CASE expression Searched Form

The search form of the CASE expression allows for more versatile testing.  It can be used to evaluate a greater range of tests.  In fact, any Boolean expression is allowed as a test.

A searched from CASE expression has this format

CASE
   WHEN booleanExpression THEN result
   …
   ELSE otherResult
END

With the searched form, one or more Boolean expressions are tested within WHEN clauses.  The result of the first TRUE Boolean expression is returned.

Below is the searched CASE form of the employee gender example  from the previous section.

SELECT JobTitle,
       CASE
          WHEN Gender = 'M' THEN 'Male'
          WHEN Gender = 'F' THEN 'Female'
          ELSE 'Unknown Value'
       END
FROM   HumanResources.Employee

As you can see it is the same example we used for the simple CASE expression.  I did this so you could see the subtle difference.  Notice that each WHEN clause now contains the test as a Boolean expression.

Here are the statements side-by-side:

Simple versus Searched CASE expression

Simple versus Searched CASE expression

I tend to use the searched CASE expression format in all my SQL.  This reason is simple, I only have to remember one format!

Since we’re testing Boolean expressions, the searched CASE statement isn’t limited to just equality tests.  This makes this form really good for comparing ranges of values.   Perhaps the sales manager of Adventure Works wants to organize products by price range.  How could this be done with SQL?

Given the following names and ranges provided by the sales manager, we can construct  a CASE expression to compare the ListPrice to a range of values and then return the appropriate price range name.

Price Range Definitions

The case statement is placed in the SELECT column list and returns a character value.  Here’s the SQL that does the trick:

SELECT Name,
       ListPrice,
       CASE
          WHEN ListPrice = 0 THEN 'No Price'
          WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
          WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
          WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
          WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
          WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
          WHEN ListPrice > 2000 THEN 'Luxury'
          ELSE 'UNLISTED'
       END as PriceRange
FROM   Production.Product

When you run this query you’ll see PriceRange listed and displaying values according to the ranges specified in the CASE expression:

Price Range Query Results

CASE Statement Results

CASE statements can also be used to help prevent errors.   A good example is to test for valid values within expressions such as when you divide numbers.

Consider

SELECT ProductID,
       Name,
       ProductNumber,
       StandardCost,
       ListPrice,
       StandardCost / ListPrice as CostOfGoodSold
FROM   Production.Product

This statement return the message

Divide by zero error encountered.

By using a CASE clause we can ensure we don’t inadvertently divide by zero.

SELECT ProductID,
       Name,
       ProductNumber,
       StandardCost,
       ListPrice,
       CASE
          WHEN ListPrice = 0 Then NULL
          ELSE StandardCost / ListPrice
       END as CostOfGoodSold
FROM   Production.Product

A CASE expression can be used wherever an expression can be used.  This means you can use it to return a column value result or even use it in an ORDER BY clause.

In the following section we’ll explore using CASE in the ORDER BY and GROUP BY clauses.

CASE expression in ORDER BY

Continuing on with the sales manager request, suppose she also wants to see the products sorted by price range and then product name.  We’ve seen how we can display the price ranges as a column, but how do we sort?

Actually it is pretty easy.  Since CASE is an expression, we can use it as once of the values from which order the results.  Remember, we aren’t limited to just sorting table columns, we can also sort an expression.

Here is the query to sort by the price range.

SELECT   Name,
         ListPrice
FROM     Production.Product
ORDER BY CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
            ELSE 'UNLISTED'
         END,
         Name

We can then add CASE statement to SELECT list to also display the price range.

SELECT   Name,
         ListPrice,
         CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
            ELSE 'UNLISTED'
         END as PriceRange
FROM     Production.Product
ORDER BY CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
           ELSE 'UNLISTED'
        END,
        Name

As you can see, things start to get complicated.  Do you see how the CASE statement is repeated in both the SELECT list and ORDER BY?  Fortunately, we can simplify this a bit, but removing the CASE statement from the ORDER BY and replacing it with the SELECT list CASE expression’s alias name PriceRange as so:

SELECT   Name,
         ListPrice,
         CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
            ELSE 'UNLISTED'
         END as PriceRange
FROM     Production.Product
ORDER BY PriceRange,
         Name

CASE expression in GROUP BY

Now that we’ve given the sales manager a detailed listing she wants to see summary data – doesn’t it ever end?  In my experience it doesn’t, so knowing lots of SQL to satiate customer demands is your key to success.

Anyways, the good news is we can use the CASE expression we’ve built to create summary groups.  In the following SQL we’re grouping the data by PriceRange.  Summary statistics on the minimum, maximum, and average ListPrice are created.

SELECT   CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
            ELSE 'UNLISTED'
         END as PriceRange,
         Min(ListPrice) as MinPrice,
         Max(ListPrice) as MaxPrice,
         AVG(ListPrice) as AvgPrice,
         Count(ListPrice) as NumberOfProducts
FROM     Production.Product
GROUP BY CASE
            WHEN ListPrice = 0 THEN 'No Price'
            WHEN ListPrice > 0 AND ListPrice <= 50 THEN 'Low'
            WHEN ListPrice > 50 AND ListPrice <= 150 THEN 'Medium'
            WHEN ListPrice > 150 AND ListPrice <= 500 THEN 'High'
            WHEN ListPrice > 500 AND ListPrice <= 1000 THEN 'Very High'
            WHEN ListPrice > 1000 AND ListPrice <= 2000 THEN 'Highest'
            WHEN ListPrice > 2000 THEN 'Luxury'
            ELSE 'UNLISTED'
         END
ORDER BY MinPrice

Unlike the ORDER BY clause, we can’t reference the column alias PriceRange in the GROUP BY.  The entire CASE expression has to be repeated.  Here are the results of our query:

Results of CASE expression in GROUP BY

Results – CASE Expression in GROUP BY

 

Wrap Up

As you can see, using CASE expressions adds versatility to your SQL statements.  They not only allow you to transform data from one set of values to another, but can also be used to ensure statements don’t return errors.

Out of the two forms, simple and searched, I tend use the search form.  The reason is that the simple form is limited to equality tests; whereas, the searched form can do that and more.

Since CASE expressions are expressions and not statements or clauses, they can be used where any other expression is used.  That mean you can use throughout the SELECT statement and elsewhere in SQL.

The post Using Conditional Logic in SQL with CASE Expression appeared first on Essential SQL.


Viewing all articles
Browse latest Browse all 162

Trending Articles