Retrieving Data

This chapter describes how to retrieve information from a Mimer SQL database. In a relational database, information retrieved from one or more source tables is returned in the form of a result table, also called a result set.

The statement used to retrieve information is SELECT.

The examples in this chapter are based on the example database included with the Mimer SQL distribution, see The Example Environment.

Simple Retrieval

The simplest retrievals fetch information from one table.

The general form of the simple SELECT statement is:

SELECT column-list FROM table_name WHERE condition;

 

The column-list specifies which columns to select, and the WHERE condition determines which rows to select. If no WHERE condition is specified, all rows are retrieved from the source table or view.

Examples of Simple Retrieval

Find the format identifiers and their meaning:

SELECT format_id, format

FROM formats;

Returns:

FORMAT_ID

FORMAT

1

Audio CD

2

Cassette

3

DVD Audio

4

Vinyl

5

Audio Cassette

6

Audio CD

7

Hardcover

8

Paperback

9

DVD Video

10

Video

Find the name and code for all countries that use Australian dollars (AUD).

SELECT country, code

FROM countries

WHERE currency_code = 'AUD';

Returns:

COUNTRY

CODE

Australia

AU

Cocos (Keeling) Islands

CC

Christmas Island

CX

Heard and McDonald Islands

HM

Kiribati

KI

Norfolk Island

NF

Nauru

NR

Tuvalu

TV

 

How to formulate selection conditions is described in detail in Selecting Specific Rows.

Result Order

The columns in the result table are ordered as they are written in the SELECT statement, irrespective of the ordering in the source table. For example:

SELECT format, format_id

FROM formats;

Returns:

FORMAT

FORMAT_ID

Audio CD

1

Cassette

2

DVD Audio

3

Vinyl

4

Audio Cassette

5

Audio CD

6

Hardcover

7

Paperback

8

DVD Video

9

Video

10

Quick Select

A shorthand form for selecting all columns from a table is:

SELECT * FROM table_name ...

 

In this case, the columns in the result table are ordered as they are defined in the source table.

Table Names

A table name in a SELECT statement may be qualified by the name of the schema to which the table belongs in the form schema.table.

Unqualified table names are implicitly qualified by the ident name of the current user.

The table name must be written in its qualified form if the name of the schema to which the table belongs differs from the name of the current user.

For example:

SELECT *

FROM mimer_store.formats;

Returns:

FORMAT_ID

FORMAT

CATEGORY_ID

DISPLAY_ORDER

1

Audio CD

1

20

2

Cassette

1

30

3

DVD Audio

1

10

4

Vinyl

1

40

5

Audio Cassette

2

40

6

Audio CD

2

30

7

Hardcover

2

10

8

Paperback

2

20

9

DVD Video

3

10

10

Video

3

20

Setting Column Labels

Columns in the result table normally have the same name as the corresponding columns in the source table.

By using an AS clause after the column name in the SELECT statement, the column in the result table can be given an alternative name.

AS clauses can be used for as many columns as required. A label may be up to 128 characters long, and follows the same syntax rules as column names, see the Mimer SQL Reference Manual, SQL Identifiers.

For example:

SELECT format AS format_name, category_id

FROM formats;

Returns:

FORMAT_NAME

CATEGORY_ID

Audio CD

1

Cassette

1

DVD Audio

1

Vinyl

1

Audio Cassette

2

Audio CD

2

Hardcover

2

Paperback

2

DVD Video

3

Video

3

Labels are particularly useful in queries that retrieve computed values, where the result table column is otherwise unnamed, see Retrieving Computed Values.

Eliminating Duplicate Values

The simple SELECT statement retrieves all rows which fulfill the selection conditions. The result may contain duplicate values.

For example:

SELECT category_id

FROM formats;

Returns:

CATEGORY_ID

1

1

1

1

2

2

2

2

3

3

By adding the keyword DISTINCT before the column list you can eliminate all duplicate rows from the result table.

The keyword DISTINCT may only be used once in a simple SELECT statement.

For example:

SELECT DISTINCT category_id

FROM formats;

Returns:

CATEGORY_ID

1

2

3

DISTINCT also eliminates duplicate rows containing null values, although technically null is not regarded as equal to null, see Handling Null Values.

If the selected columns include the whole primary key in the source table, the keyword DISTINCT is unnecessary, since all rows in the result table will be unique. Remember however that a view may contain duplicate rows, so that selecting all columns does not always guarantee that the result does not contain duplicate rows.

Selecting Specific Rows

Rows are selected in the SELECT statement according to the search condition in the WHERE clause. This condition relates column value(s) to expressions.

Comparison Conditions and WHERE

Comparison operators that may be used in the WHERE clause are:

 

Operator

Explanation

=

equal to

<>

not equal to

<

less than

<=

less than or equal to

>

greater than

>=

greater than or equal to

Comparisons can be combined in the search condition using the logical operators AND and OR, and reversed using NOT.

Each comparison must be expressed in full; for example

WHERE price >= 10.00 AND price <= 20.00

 

may not be expressed as

WHERE price >= 10.00 AND <= 20.00

Comparing Character Strings

Character strings are compared character by character from left to right.

If strings are of different lengths, the shorter is conceptually padded to the right with blanks before the comparison is made (i.e. character difference takes precedence over length difference).

The default collation for characters is an extended Latin1 character set as defined by ISO 8859-1, see the Mimer SQL Reference Manual, Character Sets for the exact sequence. Default collation for Unicode characters (national character data) is the UCS_BASIC collation.

For more information on collations, see the Mimer SQL User's Manual, Collations.

Retrieve the European Article Number (EAN), price and number in stock for all available items costing 100 euros and more:

SELECT ean_code, price, stock

FROM items

WHERE status = 'A'

  AND price >= 100.00;

Returns:

EAN_CODE

PRICE

STOCK

790051157548

115.98

14

790051155506

279.98

16

790051595920

227.98

10

Comparing Temporal Data

When stating conditions on temporal data in tables, datetime and interval literals can be specified. There are also the CURRENT_DATE, LOCALTIME, LOCALTIMESTAMP and BUILTIN.UTC_TIMESTAMP functionality which read the server clock and return that value.

If there is more than one occurrence of these pseudo literals in a statement the clock is only read once.

List the EAN and price for any items released on September 5, 1994:

SELECT ean_code, price

FROM items

WHERE release_date = DATE'1994-09-05';

Returns:

EAN_CODE

PRICE

9780001006041

7.00

Retrieve the EAN and price for any items with a release date in the future:

SELECT ean_code, price

FROM items

WHERE release_date > CURRENT_DATE;

Returns:

EAN_CODE

PRICE

7298976754871

13.98

7464376662256

15.98

9781990789861

13.99

9781993789639

6.99

For an example of interval literals, see Datetime Arithmetic and Functions.

Pattern Conditions

LIKE is used to search for character strings that match a specified pattern.

Patterns in the LIKE condition are written with wildcard characters (also called meta-characters):

 

Pattern

Explanation

_

(underscore) stands for any single character

%

stands for any sequence of zero or more characters

(Wildcards only have significance in LIKE predicates.)

Find all currencies whose names include the string “Islands”:

SELECT currency

FROM currencies

WHERE currency LIKE '%Islands%';

Returns:

CURRENCY

Falkland Islands Pounds

Cayman Islands Dollars

Solomon Islands Dollars

Find all formats whose names do not contain the string “Audio”:

SELECT format

FROM formats

WHERE format NOT LIKE '%Audio%';

Returns:

FORMAT

Cassette

Vinyl

Hardcover

Paperback

DVD Video

Video

 

Remember that character strings in SQL statements are always written within single quotation marks (' ').

A LIKE predicate where the pattern string does not contain any wildcard characters is essentially equivalent to a basic predicate using the '=' operator, except that comparison strings in an '='comparison are conceptually padded with blanks whereas those in the LIKE comparison are not.

For example:

'Dollars ' =    'Dollars' is true

'Dollars ' LIKE 'Dollars ' is true

'Dollars ' LIKE 'Dollars%' is true

 

but

'Dollars ' LIKE 'Dollars' is false

 

The LIKE predicate may include an ESCAPE clause defining a character which is used to ‘escape’ wildcard characters. A wildcard character immediately following an escape character is taken at face value. See the Mimer SQL Reference Manual, The LIKE Predicate, for more details.

More about Searching for Character Strings

Some other examples of searching for character strings are:

LIKE '%P%'

matches any string that contains an upper-case ‘P’.

LIKE '%P%' COLLATE english_1

matches any string that contains an upper or lower case ‘P’.

LIKE '_abc'

matches any four letter character string ending with lower-case ‘abc’.

LIKE '%A\%' ESCAPE '\'

matches any string ending with ‘A%’.

LIKE 'D_d_'

matches any four letter string with D and d in the first and third positions respectively. Examples of possible values: Dude, Dads.

Set Conditions

IN and NOT IN

The operator IN finds the values that are contained in a set of values. The set is given as a comma-separated list enclosed in parentheses.

NOT IN finds values which are not contained in the specified set.

Which currencies are represented by the codes “SEK” or “GBP”?

SELECT currency

FROM currencies

WHERE code IN ('SEK', 'GBP');

Returns:

CURRENCY

Pounds Sterling

Swedish Kronor

List all the formats other than those for identifiers 1, 5 and 7:

SELECT format, format_id

FROM formats

WHERE format_id NOT IN (1, 5, 7);

Returns:

FORMAT

FORMAT_ID

Cassette

2

DVD Audio

3

Vinyl

4

Audio CD

6

Paperback

8

DVD Video

9

Video

10

Note:NOT IN is undefined if the subselect’s result contains a null value. E.g. SELECT * FROM tab WHERE 1 NOT IN (3, <null>, 4) will return an empty result set.

BETWEEN and NOT BETWEEN

The operators BETWEEN and NOT BETWEEN are used to find values that are within or outside an interval. The interval includes the limits specified in the BETWEEN condition.

Find the EAN and release date for EANs outside the “Bookland” range (e.g. 978 pre­fix) that were released during January 1998:

SELECT ean_code, release_date

FROM items

WHERE ean_code NOT BETWEEN 9780000000000 AND 9789999999999

  AND release_date BETWEEN DATE'1998-01-01' AND DATE'1998-01-31';

Returns:

EAN_CODE

RELEASE_DATE

90431587720

1998-01-05

93624662426

1998-01-13

45778040629

1998-01-20

BETWEEN may also be used for character comparisons.

For example:

SELECT code, country

FROM countries

WHERE country BETWEEN 'South Africa' AND 'Suriname';

Returns:

CODE

CURRENCY

ES

Spain

LK

Sri Lanka

SD

Sudan

SR

Suriname

ZA

South Africa

Find which currencies have an exchange rate in the range of 1 to 2:

SELECT currency

FROM currencies

WHERE exchange_rate BETWEEN 1.00 AND 2.00;

Returns:

CURRENCY

Netherlands Antillian Guilders

Australian Dollars

Aruban Guilders

Convertible Marka

Barbados Dollars

Leva

BETWEEN SYMMETRIC

BETWEEN has the SYMMETRIC option which is used to verify the interval’s lower and upper limits in both directions. This is especially useful when writing queries where the BETWEEN limit values are not defined until run-time, or where the limits are column or function references.

BETWEEN SYMMETRIC example with host variables:

SELECT code, country

FROM countries

WHERE country BETWEEN SYMMETRIC :country1 AND :country2;

country1: Suriname

country2: South Africa

Returns:

CODE

CURRENCY

ES

Spain

LK

Sri Lanka

SD

Sudan

SR

Suriname

ZA

South Africa

Without SYMMETRIC specified, this query would have returned an empty result set.

BETWEEN SYMMETRIC example with columns as arguments:

SELECT *

FROM table1 JOIN table2

ON table1.col1 BETWEEN SYMMETRIC table2.col1 AND table2.col2;

Retrieving Computed Values

You can retrieve computed values by using arithmetic, string and boolean operators in the SELECT clause of the statement.

When retrieving computed values, parentheses can be used to force the operation priority.

Without parentheses, the normal precedence rules apply. See Mimer SQL Reference Manual, Operator Precedence, for information regarding operator precedence.

The following computational operators may be used:

Operator

Explanation

+

addition

-

subtraction

*

multiplication

/

division

||

string concatenation

See the Mimer SQL Reference Manual, SQL Syntax Elements, for information regarding the type and precision of the result of an arithmetic expression.

Show the exchange rate for the US Dollar if there was a -10% change:

SELECT exchange_rate, exchange_rate * 0.90

FROM  currencies

WHERE code = 'USD';

Returns:

EXCHANGE_RATE

 

0.8711

0.783990

Evaluating Boolean Expressions

Boolean expressions return a truth value (TRUE or FALSE), depending on the result of one or more boolean expressions.

Specify when the exchange rate is less than 1 or the currency code is “ALL”:

SELECT currency,

       exchange_rate < 1.0 or code = 'ALL'

FROM currencies;

Returns:

currency

 

UAE Dirhams

FALSE

Afghanis

FALSE

Leke

TRUE

Armenian Drams

-

Netherlands Antillian Guilders

FALSE

Kwanza

-

Argentine Pesos

TRUE

Australian Dollars

FALSE

Labels and Computed Values

The computed column is unnamed by default in the result table. A label may be used to provide a name.

For example:

SELECT exchange_rate, exchange_rate * 0.90 AS new_exchange_rate

FROM currencies

WHERE code = 'USD';

Returns:

EXCHANGE_RATE

NEW_EXCHANGE_RATE

0.8711

0.783990

Constant Values

A column may also be computed as a constant value, which adds an extra column to the result table.

For example:

SELECT exchange_rate, '10% reduction:',

       exchange_rate * 0.90 AS new_exchange_rate

FROM currencies

WHERE code = 'USD';

Returns:

EXCHANGE_RATE

 

NEW_EXCHANGE_RATE

0.8711

10% reduction:

0.783990

You may also retrieve a value computed using the values in two or more columns, providing that the data types are compatible.

Retrieve the currencies prefixed with the word “Currency:”:

SELECT 'Currency: ' || currency

FROM currencies

WHERE code LIKE 'A%';

Returns:

 

Currency: UAE Dirhams

Currency: Afghanis

Currency: Leke

Currency: Armenian Drams

Currency: Netherlands Antillian Guilders

Currency: Kwanza

Padding Concatenated Strings

For string concatenation, column values are padded with trailing blanks to the length of the column definition, if the column data type is fixed-length (CHARACTER or NATIONAL CHARACTER).

For example:

SELECT currency || 'Currency'

FROM currencies

WHERE code LIKE 'A%';

Returns:

 

UAE Dirhams                      Currency

Afghanis                         Currency

Leke                             Currency

Armenian Drams                   Currency

Netherlands Antillian Guilders   Currency

Kwanza                           Currency

(If the column data type is variable length, i.e. VARCHAR or NCHAR VARYING, no blank padding is performed.)

Using Scalar Functions

Scalar functions operate on expressions or on a single value received from a SELECT statement.

Some of the scalar functions available are:

Scalar function

Description

CHAR_LENGTH

returns the length of a string.

EXTRACT

returns a single field from a DATETIME or INTERVAL value.

LOWER

converts all upper case letters in a character string to lower case.

POSITION

returns the starting position of the first occurrence of a specified string expression, starting from the left, in the given character string.

SOUNDEX

returns a character string containing six digits which represents an encoding of the sound of the given character string.

SUBSTRING

extracts a substring from a given string, according to specified start position and length of the substring.

TRIM

removes leading and/or trailing instances of a specified character from a string.

UPPER

converts all lower case letters in a character string to upper case.

The complete list of scalar functions can be found in the Mimer SQL Reference Manual, Scalar Functions.

Examples of Scalar Functions

The following are examples that illustrate how the scalar functions may be used:

List all currencies that contain the letters “AU” in upper or lower case:

SELECT currency

FROM currencies

WHERE LOWER(currency) LIKE '%au%';

Returns:

CURRENCY

Australian Dollars

Mauritius Rupees

Saudi Riyals

Note:Alternatively, a case insensitive collation can be used to get the same result.

SELECT currency

FROM currencies

WHERE currency COLLATE english_1 LIKE '%au%';

Find the position of the first space character in the formats column:

SELECT format, POSITION(' ' IN format)

FROM formats;

Returns:

FORMAT

 

Audio CD

6

Cassette

0

DVD Audio

4

Vinyl

0

Audio Cassette

6

Audio CD

6

Hardcover

0

Paperback

0

DVD Video

4

Video

0

Append the word “Currency” to the currencies (without trailing spaces):

SELECT TRIM(TRAILING FROM currency) || ' Currency'

FROM currencies

WHERE code LIKE 'A%';

Returns:

 

UAE Dirhams Currency

Afghanis Currency

Leke Currency

Armenian Drams Currency

Netherlands Antillian Guilders Currency

Kwanza Currency

Remove both leading and trailing spaces from the currencies and convert to upper­case; and get the significant length (in characters):

SELECT UPPER(TRIM(currency)), CHAR_LENGTH(TRIM(currency))

FROM currencies;

Returns:

 

 

UAE DIRHAMS

11

AFGHANIS

8

LEKE

4

ARMENIAN DRAMS

14

NETHERLANDS ANTILLIAN GUILDERS

30

KWANZA

6

...

...

Find the country that sounds the same as “ASTRALYA”:

SELECT country

FROM countries

WHERE SOUNDEX(country) = SOUNDEX('astralya');

Returns:

COUNTRY

Australia

Extract the first 5 characters from each format:

SELECT SUBSTRING(format FROM 1 FOR 5)

FROM formats;

Returns:

FORMAT

Audio

Casse

DVD A

Vinyl

Audio

Audio

Hardc

Paper

DVD V

Video

Using the CASE Expression

With a CASE expression, it is possible to specify a conditional value. Depending on the result of one or more conditional expressions, the CASE expression can return different values.

The rules for CASE expressions are fully described in the Mimer SQL Reference Manual, CASE Expression.

Case Expression Examples

The following select statements presents two examples of how CASE expressions can be used.

Simple Case Expression

Give a textual description to the DISPLAY_ORDER column and display them in the numeric order:

SELECT category_id,

      CASE display_order

          WHEN 10 THEN 'FIRST'

          WHEN 20 THEN 'SECOND'

          WHEN 30 THEN 'THIRD'

          WHEN 40 THEN 'FOURTH'

          ELSE 'UNKNOWN'

      END,

      format

FROM formats

ORDER BY category_id, display_order, format;

Returns:

CATEGORY_ID

 

FORMAT

1

FIRST

DVD Audio

1

SECOND

Audio CD

1

THIRD

Cassette

1

FOURTH

Vinyl

2

FIRST

Hardcover

2

SECOND

Paperback

2

THIRD

Audio CD

2

FOURTH

Audio Cassette

3

FIRST

DVD Video

3

SECOND

Video

This form of a case expression is known as a simple case expression, in which an operand is compared to a list of values.

If there is a match in one of the when clauses, the result is the value to the right of the then clause.

If none of these matches, the value in the else clause is returned.

If there is no else clause in a case expression and no when clause matches, a null value is returned.

Case Expression

The other form of the case expression can be seen in the following example.

Display the word “UNKNOWN” if the EXCHANGE_RATE value is undefined (i.e. null); and display the word “PARITY” if the rate is 1 to 1; otherwise do not display anything:

SELECT currency,

      CASE

          WHEN exchange_rate IS NULL THEN 'UNKNOWN'

          WHEN exchange_rate = 1.0 THEN 'PARITY'

          ELSE ''

      END

FROM currencies

WHERE code LIKE 'A%';

Returns:

CURRENCY

 

UAE Dirhams

 

Afghanis

 

Leke

 

Armenian Drams

UNKNOWN

Netherlands Antillian Guilders

 

Kwanza

UNKNOWN

In this form it is possible that more than one of the when clauses evaluates to true, in which case the value in the first (from left) of the matching clauses is returned.

Using the CAST Specification

The CAST specification explicitly converts data of one data type to another data type.

Conversion between data types is allowed if the rules for assignment to the target data type are not violated. See the Mimer SQL Reference Manual, CAST Specification for more information.

Show the exchange rate for the US Dollar with a -10% change, force the result to four decimal places:

SELECT CAST(exchange_rate * 0.90 AS DECIMAL(12, 4))

FROM currencies

WHERE code = 'USD';

Returns:

 

0.7839

Retrieve the EAN, price and release date for any items released in 1987. Convert the release date (DATE'YYYY-MM-DD') to character with the format MM/DD/YY:

SELECT ean_code, price,

      SUBSTRING(CAST(release_date AS CHAR(26)) FROM 11 FOR 2)

       || '/' ||

      SUBSTRING(CAST(release_date AS CHAR(26)) FROM 14 FOR 2)

       || '/' ||

      SUBSTRING(CAST(release_date AS CHAR(26)) FROM 8 FOR 2) AS date

FROM items

WHERE EXTRACT(YEAR FROM release_date) = 1987;

Returns:

EAN_CODE

PRICE

DATE

9780006167242

4.99

04/30/87

9780002315432

15.99

06/15/87

 

Datetime Arithmetic and Functions

It is possible to use datetime and interval values in expressions to calculate new datetime and interval values.

Valid operations are:

addition or subtraction between an interval value and a datetime value

subtracting a datetime from another datetime value

adding or subtracting two interval values

multiplying or dividing an interval by a numerical value

The first of these operations yields a datetime value while the others result in an interval value.

Retrieve the EAN, price and the number of days to the release date for any items with a release date in the future:

SELECT ean_code, price,

      (release_date - CURRENT_DATE) DAY(3) AS days

FROM items

WHERE release_date > CURRENT_DATE;

Returns:

EAN_CODE

PRICE

DAYS

7298976754871

13.98

5

7464376662256

15.98

12

9781990789861

13.99

8

9781993789639

6.99

4

When taking the difference between two datetime values it is necessary to specify the type of the resulting interval.

It is also possible to specify the precision of the interval as shown in the above example; the default precision for day is 2.

Retrieve the EAN, price and the release date for any items with a release date in the next 100 hours:

SELECT ean_code, price, release_date

FROM items

WHERE CAST(release_date AS TIMESTAMP)

        BETWEEN LOCALTIMESTAMP

         AND LOCALTIMESTAMP + INTERVAL '100' HOUR(3);

Returns:

EAN_CODE

PRICE

RELEASE_DATE

9781993789639

6.99

2002-03-15

About Intervals

SQL distinguishes between YEAR-MONTH (long) intervals and DAY-TIME (short) intervals.

YEAR-MONTH intervals are: YEAR, MONTH and YEAR TO MONTH.

DAY-TIME intervals are: DAY, HOUR, MINUTE, SECOND, HOUR TO MINUTE, HOUR TO SECOND, MINUTE TO SECOND, DAY TO HOUR, DAY TO MINUTE and DAY TO SECOND.

Long intervals may only be compared to other long intervals, and short intervals may only be compared to other short intervals, i.e. short and long intervals are not comparable.

Extracting Values

It is possible to extract part of a datetime value with the EXTRACT function. The function returns a numeric value.

Extract the month and year for any items with a release date in the next 4 days:

SELECT CASE EXTRACT(MONTH FROM release_date)

          WHEN  1 THEN 'January'

          WHEN  2 THEN 'February'

          WHEN  3 THEN 'March'

          WHEN  4 THEN 'April'

          WHEN  5 THEN 'May'

          WHEN  6 THEN 'June'

          WHEN  7 THEN 'July'

          WHEN  8 THEN 'August'

          WHEN  9 THEN 'September'

          WHEN 10 THEN 'October'

          WHEN 11 THEN 'November'

          WHEN 12 THEN 'December'

      END

       || ' ' ||

      CAST(EXTRACT(YEAR FROM release_date) AS CHAR(4))

FROM items

WHERE release_date BETWEEN CURRENT_DATE

                      AND CURRENT_DATE + INTERVAL '4' DAY;

Returns:

 

March 2002

DAYOFWEEK

Another useful function is DAYOFWEEK which returns the day number within a week. Monday has the value 1 and Sunday has the value 7.

Find the release day for any items with a release date in the next 4 days:

SELECT CASE DAYOFWEEK(release_date)

          WHEN 1 THEN 'Monday'

          WHEN 2 THEN 'Tuesday'

          WHEN 3 THEN 'Wednesday'

          WHEN 4 THEN 'Thursday'

          WHEN 5 THEN 'Friday'

          WHEN 6 THEN 'Saturday'

          WHEN 7 THEN 'Sunday'

      END

FROM items

WHERE release_date BETWEEN CURRENT_DATE

                      AND CURRENT_DATE + INTERVAL '4' DAY;

Result:

 

Friday

Using Set Functions

The functions listed below can be used in the column list of the SELECT statement to retrieve the result of the function on a specified column.

Set function

Explanation

AVG

average of values (numerical columns only)

COUNT

number of rows

MAX

largest value

MIN

smallest value

SUM

sum of values (numerical columns only)

About Set Functions

Set functions in SELECT statements are applied to data in the result table, not in the source table.

Set functions return a single value for the whole table unless a GROUP BY clause is specified, see Grouped Set Functions – the GROUP BY Clause.

For all set functions, null values are eliminated from the column before the function is applied. The special form COUNT(*) counts the number of rows including null values.

The keywords ALL and DISTINCT may be used to qualify set functions. ALL gives a result based on all values including duplicates. DISTINCT eliminates duplicates before applying the function. If neither keyword is specified, duplicates are not removed.

Note:Set functions may not be used together with direct column references in the SELECT list (unless the SELECT statement includes a GROUP BY clause, see Grouped Set Functions – the GROUP BY Clause).

Example of Set Functions

The set functions are illustrated with results from the following table:

Note: A hyphen - indicates null.

SAMPLE

1.0

2.0

2.0

2.0

3.0

3.0

4.0

5.0

-

-

COUNT(SAMPLE)   8

COUNT(*)   10

COUNT(DISTINCT SAMPLE)   5

SUM(SAMPLE)   22.0

SUM(ALL SAMPLE)   22.0

SUM(DISTINCT SAMPLE)   15.0

AVG(SAMPLE)   2.75000000000

AVG(ALL SAMPLE)   2.75000000000

AVG(DISTINCT SAMPLE)   3.00000000000

MAX(SAMPLE)   5.0

MIN(SAMPLE)   1.0

 

Note:AVG(column) is equivalent to SUM(column)/COUNT(column). However, the expression SUM(column)/COUNT(*) will give a different answer if the column includes null values.

Thus, for the table above:

SUM(SAMPLE)/COUNT(SAMPLE)    2.75000000000   (22/8)

SUM(SAMPLE)/COUNT(*)         2.20000000000   (22/10)

More Set Functions Examples

Some further examples of set functions applied to the example database are given below.

How many rows are there in the CURRENCIES table?

SELECT COUNT(*)

FROM currencies;

How many currencies have a defined exchange rate (i.e. EXCHANGE_RATE is not null)?

SELECT COUNT(exchange_rate)

FROM currencies;

What is the average exchange rate?

SELECT AVG(exchange_rate)

FROM currencies;

Decimal Calculation

The AVG function returns an integer if the operand is an integer, and a decimal if the operand is decimal. To force decimal calculation of averages from an integer column, cast the column operand as decimal:

SELECT AVG(CAST(column AS DECIMAL)) ...

Grouped Set Functions – the GROUP BY Clause

Normally, set functions return a single value, calculated from the set of all values in the column or expression.

If the SELECT statement includes a GROUP BY clause, set functions will be applied to groups of values. Columns used for GROUP BY do not have to be included in the SELECT list.

Find the number of rows in each category within the FORMATS table:

SELECT category_id, COUNT(*)

FROM formats

GROUP BY category_id;

Returns:

CATEGORY_ID

 

1

4

2

4

3

2

Restrictions When Using GROUP BY

Using a GROUP BY clause places some restrictions on the SELECT statement.

Only constants, columns listed in the GROUP BY clause, and columns used as arguments to set functions may be included in the SELECT list.

A derived table can be used to overcome this restriction.

Find the number of released items, grouped by year, month and format:

select y, m, format, count(*)

from

(

   select extract(year from release_date) as y,

          extract(month from release_date) as m,

          format

   from product_details

)

group by y, m, format;

Null Values

For grouping purposes, null values are regarded as equivalent. Thus for the example table:

SAMPLE

1.0

2.0

2.0

2.0

3.0

3.0

4.0

5.0

-

-

The following statement:

SELECT sample, COUNT(*) as number

  ...

GROUP BY sample;

Returns:

SAMPLE

NUMBER

1.0

1

2.0

3

3.0

2

4.0

1

5.0

1

-

2

Selecting Groups – the HAVING Clause

The HAVING clause restricts the selection of groups in the same way that a WHERE clause restricts the selection of rows. However, in contrast to the WHERE clause, a HAVING clause may use a set function on the left-hand side of a comparison.

The HAVING clause is most often used together with a GROUP BY clause, but may also be used to impose selection conditions on a column derived from a set function.

Find the currency codes that are used by more than one country:

SELECT currency_code, COUNT(currency_code)

FROM countries

GROUP BY currency_code

HAVING COUNT(currency_code) > 1;

Returns:

CURRENCY_CODE

 

AUD

8

CHF

2

DKK

3

EUR

23

IDR

2

Ordering the Result Table

Strictly, the order of rows in a result table is undefined unless an ORDER BY clause is included in the SELECT statement.

Ascending or descending order may be specified; ascending order is the default.

Note:A SELECT statement without an ORDER BY clause may appear to give an ordered result in Mimer SQL, but you should include an ORDER BY clause if the ordering is important. Without the ORDER BY clause, a change in the database contents or updated Mimer SQL version can otherwise change the order.

Example

SELECT *

FROM formats

ORDER BY format DESC;

Returns:

FORMAT_ID

FORMAT

CATEGORY_ID

DISPLAY_ORDER

4

Vinyl

1

40

10

Video

3

20

8

Paperback

2

20

7

Hardcover

2

10

9

DVD Video

3

10

3

DVD Audio

1

10

2

Cassette

1

30

1

Audio CD

1

20

6

Audio CD

2

30

5

Audio Cassette

2

40

Ordering by More than One Column

More than one column may be specified in the ORDER BY clause.

Example

SELECT *

FROM formats

ORDER BY category_id, display_order;

Returns:

FORMAT_ID

FORMAT

CATEGORY_ID

DISPLAY_ORDER

3

DVD Audio

1

10

1

Audio CD

1

20

2

Cassette

1

30

4

Vinyl

1

40

7

Hardcover

2

10

8

Paperback

2

20

6

Audio CD

2

30

5

Audio Cassette

2

40

9

DVD Video

3

10

10

Video

3

20

Ordering by Set Function

To order a result table by a set function, the column in the result table is given a label and the label is used in the ORDER BY clause.

Example

SELECT category_id, MAX(display_order) AS maximum_value

FROM formats

GROUP BY category_id

ORDER BY maximum_value;

Returns:

 

CATEGORY_ID

MAXIMUM_VALUE

3

20

1

40

2

40

Ordering by a Computed Value

To order a result table by a computed value, place the computation in the ORDER BY clause.

Example

SELECT product

FROM products JOIN items ON products.product_id = items.product_id

WHERE format_id = 2

ORDER BY stock * price;

Returns:

 

PRODUCT

The Wild, the Innocent and the E Street Shuffle

Greatest Hits

On How Life Is

Snowed In

Christmas Portrait

Falling into You

LaTouché IV: Traditional Cajun Dancin' Music

Atlanta Homecoming

Born in the U.S.A.

Retrieving Data From More Than One Table

The examples presented up to now in this chapter have illustrated the essential features of simple SELECT statements with data retrieval from single tables. However, much of the power of SQL lies in the ability to perform joins through a single statement, i.e. to select data from two or more tables, using the search condition to link the tables in a meaningful way.

The Join Condition

In retrieving data from more than one table, the search condition or join condition specifies the way the tables are to be linked. For example:

List the product name in addition to the EAN and price:

SELECT product, ean_code, price

FROM items

    JOIN products ON items.product_id = products.product_id;

 

The join condition here is ITEMS.PRODUCT_ID = PRODUCTS.PRODUCT_ID, which relates the product identifier in table ITEMS (where codes are listed) to the product identifier in table PRODUCTS (where names are listed).

Returns:

PRODUCT

EAN_CODE

PRICE

100 Anos

77774238724

9.98

12 Golden Country Greats

75596190923

17.98

12 Super Exitos

724385487521

9.98

1492: Conquest of Paradise

75678243226

17.98

Conceptually, the join first establishes a table containing all combinations of the rows in PRODUCTS with the rows in ITEMS, then selects those rows in which the two PRODUCT_ID values are equal. See Conceptual Description of the Selection Process for a fuller description of the conceptual SELECT process.

This does not necessarily represent the order in which the operations are actually performed; the order of evaluation of a complex SELECT statement is determined by the SQL optimizer, regardless of the order in which the component clauses are written.

Cross Products

Without the join condition, the result is a cross product of the columns in the tables in question, containing all possible combinations of the selected columns, for example:

SELECT product, ean_code, price

FROM items CROSS JOIN products;

Returns:

PRODUCT

EAN_CODE

PRICE

'Murder In The Cathedral'

77774238724

9.98

'Murder In The Cathedral'

75596190923

17.98

'Murder In The Cathedral'

724385487521

9.98

'Murder In The Cathedral'

75678243226

17.98

It is easy to see that a carelessly formulated join query can produce a very large result table. Two tables of 100 rows each, for instance, give a cross product with 10,000 rows; three tables of 100 rows each give a cross product with 1,000,000 rows!

The risk of generating large (erroneous) result tables is particularly high in interactive SQL (e.g. when using Mimer BSQL), where queries are so easily written and submitted.

Simple Joins

In simple joins, all tables used in the join are listed in the FROM clause of the SELECT statement. This is in distinction to nested joins, where the search condition for one SELECT is expressed in terms of another SELECT, see Nested Selects.

Example

SELECT product, ean_code, price

FROM items

    JOIN products ON items.product_id = products.product_id;

SELECT *

The form SELECT * may be used in a join query, but since this selects all columns in the result set, at least one column is often duplicated (a join condition column).

Example

SELECT *

FROM items

    JOIN products ON items.product_id = products.product_id;

Returns:

Columns from ITEMS:

ITEM_ID

PRODUCT_ID

FORMAT_ID

RELEASE_DATE

STATUS

PRICE

STOCK

REORDER_LEVEL

EAN_CODE

PRODUCER_ID

IMAGE_ID

 

Columns from PRODUCTS:

PRODUCT

PRODUCT_ID

PRODUCT_SEARCH

Columns in the join query that are uniquely identified by the column name may be specified by name alone. Columns that have the same name in the joined tables must be qualified by their respective table names.

The same query as above, but only three columns are returned:

SELECT product, ean_code, price

FROM items

    JOIN products ON items.product_id = products.product_id;

Nesting Join Clauses

It is possible to nest join-clauses, for example:

List the category in addition to the EAN and price for any items released in Decem­ber 1996:

SELECT ean_code, price, category

FROM items

    JOIN formats ON items.format_id = formats.format_id

    JOIN categories ON categories.category_id = formats.category_id

WHERE release_date BETWEEN date'1996-12-01' AND date'1996-12-30';

Result:

 

EAN_CODE

PRICE

CATEGORY

 9780006498957

7.99

Books

724385487521

9.98

Music

731453076723

29.98

Music

53308951925

11.98

Music

Complex Search Conditions and Joins

A join query can join any number of tables using complex search conditions to select the relevant information from each table.

List the product for any items with a release date in the future along with the item price in both Swedish and Danish crowns (SEK and DKK respectively):

SELECT product,

      CAST(price * exchange_rate AS DECIMAL(12,2)) AS cost,

      currency

FROM items

JOIN products ON products.product_id = items.product_id

CROSS JOIN currencies

WHERE release_date > CURRENT_DATE

  AND currencies.code IN ('SEK', 'DKK')

ORDER BY product, currency;

Result:

PRODUCT

COST

CURRENCY

Greatest Hits

99.42

Danish Kronor

Greatest Hits

125.61

Swedish Kronor

Pieces Of Fish

113.64

Danish Kronor

Pieces Of Fish

143.58

Swedish Kronor

The Future Foretold

49.71

Danish Kronor

The Future Foretold

62.80

Swedish Kronor

The Sql Quiz Book

99.49

Danish Kronor

The Sql Quiz Book

125.70

Swedish Kronor

In formulating a search condition for a join query, it can help to write out the columns that would appear in a complete cross-product of the tables. The search condition is then formulated as though the query was a simple SELECT from the cross-product table.

Outer Joins

The joins in the previous sections were all inner joins. In an inner join between two tables, only rows that fulfill the join condition are present in the result.

An outer join, on the contrary, contains non-matching rows as well. The outer join has two options, LEFT and RIGHT.

Left Outer Join

Example

SELECT ean_code, release_date, producer

FROM items

LEFT OUTER JOIN producers

    ON items.producer_id = producers.producer_id

WHERE ean_code >= 800000000000

ORDER BY ean_code;

Result:

EAN_CODE

RELEASE_DATE

PRODUCER

800488327626

1998-08-11

Giants Of Jazz (Ita)

  801061007720

2000-10-31

Warp Records

4988002364947

1999-09-28   

-

4988011353147

1998-06-30

-

5013145800423

2000-03-14   

Mint / Cherry Red

5013929112322

1999-10-12   

Cherry Red

5014438710221

1994-12-27   

Receiver Records

5019317001728

1994-12-15   

Receiver Records

7157761806273

1996-01-18   

Status Records

In the example above all rows from the table to the left in the join clause, i.e. ITEMS, are present in the result; non-matching rows from the PRODUCERS table are filled with null values in the result.

Observe the difference in result for the next statement and the previous one.

SELECT ean_code, release_date, producer

FROM items

LEFT OUTER JOIN producers

    ON  items.producer_id = producers.producer_id

    AND ean_code >= 800000000000

ORDER BY ean_code;

Result:

EAN_CODE

RELEASE_DATE

PRODUCER

8811038120

1991-08-27

-

8811042127

1991-10-22

-

8811061326

1992-05-19

-

8811067021

1992-12-22

-

800488327626

1998-08-11

Giants Of Jazz (Ita)

  801061007720

2000-10-31

Warp Records

4988002364947

1999-09-28   

-

4988011353147

1998-06-30

-

5013145800423

2000-03-14   

Mint / Cherry Red

The reason is that conditions in the where clause are applied to the result of the join-clause and not to the joined tables as is the case with the conditions in the on-clause.

Right Outer Join

A right outer join will take all records from the table to the right in the join-clause.

Nesting Outer Joins

As with inner joins, it is possible to nest join-clauses. Nested joins can be of different types, i.e. both inner and outer joins.

The result of nested outer joins can be somewhat unexpected though, as it is the result of the first join-clause that is the left table in the next join, not the right table in the first join-clause.

Example

SELECT *

FROM tableA

    LEFT JOIN tableB ON tableA.id = tableB.id

    LEFT JOIN tableC ON tableA.id = tableC.id

 

This query does first perform tableA LEFT JOIN tableB. The result is then used as left table when performing LEFT JOIN tableC.

To make this query clearer, parentheses can be added as:

SELECT *

FROM (tableA LEFT JOIN tableB ON tableA.id = tableB.id)

  LEFT JOIN tableC ON tableA.id = tableC.id

Nested Selects

A form of SELECT, called a subselect, can be used in the search condition of a SELECT statement to form a nested query.

The main SELECT statement is then referred to as the outer select.

For example:

Select the products that have a release date in the future.

SELECT product

FROM products

WHERE product_id IN (SELECT product_id

                     FROM items

                     WHERE release_date > CURRENT_DATE);

Result:

PRODUCT

Greatest Hits

Pieces Of Fish

The Future Foretold

The Sql Quiz Book

To see how this works, evaluate the subselect first:

SELECT product_id

FROM items

WHERE release_date > CURRENT_DATE;

Result:

PRODUCT_ID

30206

30618

31082

31083

Then use the result of the subselect in the search condition of the outer select:

SELECT product

FROM products

WHERE product_id IN (30206, 30618, 31082, 31083);

Result:

PRODUCT

Greatest Hits

Pieces Of Fish

The Future Foretold

The Sql Quiz Book

Using Subselects

A subselect can be used in a search condition wherever the result of the subselect can provide the correct form of the data for the search condition.

Thus a subselect used with ‘=’ must give a single value as a result.

A subselect used with IN, ALL or ANY must give a set of single values, see Retrieval with ALL, ANY, SOME.

A subselect used with EXISTS may give any result, see Retrieving Data Using EXISTS and NOT EXISTS.

Examples:

WHERE column = (subselect)

WHERE column IN (subselect)

WHERE column = ALL (subselect)

WHERE column = ANY (subselect)

WHERE EXISTS (subselect)

 

The UNION, EXCEPT and INTERSECT operators can be used to combine two or more subselects in more complex statements, see Union, Except and Intersect Queries.

Nested Queries

Many nested queries can equally well be written as simple joins. For example:

Select the products that have a release date in the future.

SELECT product

FROM products

WHERE product_id IN (SELECT product_id

                     FROM items

                     WHERE release_date > CURRENT_DATE);

 

or alternatively

SELECT DISTINCT product

FROM products

JOIN items

    ON products.product_id = items.product_id

WHERE items.release_date > CURRENT_DATE;

 

Both these queries give exactly the same result. In most cases, the choice of which form to use is a matter of personal preference. Choose the form which you can understand most easily; the clearest formulation is least likely to cause problems.

Subselects in Queries

Queries may contain any number of subselects, for example:

List the producers (manufacturers) which have items that are more expensive than any of the items produced by Sony.

SELECT producer

FROM producers

WHERE producer_id IN

         (SELECT producer_id

          FROM items

          WHERE price >

                 (SELECT MAX(price)

                  FROM items

                  WHERE producer_id =

                          (SELECT producer_id

                           FROM producers

                           WHERE producer = 'SONY')));

 

Note the balanced parentheses for the nested levels.

It is particularly important at this level of complication to think carefully through the query to make sure that it is correctly formulated.

Often, writing some of the levels as simple joins can simplify the structure. The previous example may also be written:

SELECT DISTINCT producer

FROM producers

JOIN items

     ON producers.producer_id = items.producer_id

WHERE price > (SELECT MAX(price)

               FROM items

               JOIN producers

                   ON items.producer_id = producers.producer_id

               WHERE producer = 'SONY');

Correlation Names

A correlation name is a temporary name given to a table to represent a logical copy of the table within a query.

There are three uses for correlation names:

simplifying complex queries

joining a table to itself

outer references in subselects

Simplifying Complex Queries Using Correlation Names

Using short correlation names into complicated queries can make the query easier to write and understand, particularly when qualified table names are used:

SELECT mimer_store_music.artists.artist,

      mimer_store.product_details.*

FROM mimer_store.product_details

JOIN mimer_store_music.titles

    ON mimer_store.product_details.item_id =

       mimer_store_music.titles.item_id

JOIN mimer_store_music.artists

    ON mimer_store_music.artists.artist_id =

       mimer_store_music.titles.artist_id

ORDER BY mimer_store_music.artists.artist;

 

may be rewritten

SELECT art.artist, pdt.*

FROM mimer_store.product_details AS pdt

JOIN mimer_store_music.titles AS ttl

    ON pdt.item_id = ttl.item_id

JOIN mimer_store_music.artists AS art

    ON art.artist_id = ttl.artist_id

ORDER BY art.artist;

 

The keyword AS in the FROM clause may be omitted, but is recommended for clarity.

About Correlation Names

Correlation names are local to the query in which they are defined.

When a correlation name is introduced for a table name, all references to the table in the same query must use the correlation name.

The following expression is not accepted:

  ...

  FROM mimer_store.product_details AS pdt,

       mimer_store_music.titles AS ttl,

  ...

  WHERE ttl.item_id = mimer_store.product_details.item_id

Joining a Table with Itself Using a Correlation Name

Joining a table with itself allows you to compare information in a table with other information in the same table. This can be done with a correlation name.

Select all currencies with the same exchange rate:

SELECT c.currency, c.code, c.exchange_rate

FROM currencies AS c

JOIN currencies AS copy

    ON  c.exchange_rate = copy.exchange_rate

    AND c.currency <> copy.currency;

Result:

CURRENCY

CODE

EXCHANGE_RATE

Croatian Kuna

HRK

7.0820

Gourdes

HTQ

7.0820

Iraqi Dina

IQD

1551.0000

Uganda Shillings

UGX

1551.0000

Here, the table CURRENCIES is joined to a logical copy of itself called COPY.

The first search condition finds pairs of currencies with the same exchange rate, and the second eliminates 'pairs' with the same currency name. Without the second condition in the search criteria, all currencies would be selected!

Without correlation names, this kind of query cannot be formulated. The following query would select all the currencies from the table:

SELECT currency, code, exchange_rate

FROM currencies

WHERE currencies.exchange_rate = currencies.exchange_rate;

Outer References in Subselects Using Correlation Names

In some constructions using subselects, a subselect at a lower level may refer to a value in a table addressed at a higher level. This kind of reference is called an outer reference.

SELECT currency

FROM currencies

WHERE EXISTS (SELECT *

              FROM countries

              WHERE currency_code = currencies.code);

 

This kind of query processes the subselect for every row in the outer select, and the outer reference represents the value in the current outer select row. In descriptive terms, the query says 'For each row in CURRENCIES, select the CURRENCY column if there are rows in COUNTRIES containing the current CODE value'.

If the qualifying name in an outer reference is not unambiguous in the context of the subselect, a correlation name must be defined in the outer select.

A correlation name may always be used for clarity, as in the following example:

SELECT currency

FROM currencies AS c

WHERE EXISTS (SELECT *

             FROM countries

             WHERE currency_code = c.code);

Retrieving Data Using EXISTS and NOT EXISTS

EXISTS is used to check for the existence of some row or rows which satisfy a specified condition. EXISTS differs from the other operators in that it does not compare specific values; instead, it tests whether a set of values is empty or not. The set of values is specified as a subselect.

The subselect following the EXISTS clause most often uses of ‘SELECT *’ as opposed to ‘SELECT column-list’ since EXISTS only searches to see if the set of values addressed by the subselect is empty or not - a specified column is seldom relevant in the subquery.

EXISTS (subselect) is true if the result set of the subselect is not empty

NOT EXISTS (subselect) is true if the result set of the subselect is empty

SELECT statements with EXISTS almost always include an outer reference linking the subselect to the outer select.

Examples of EXISTS

Find all currencies that are used in the COUNTRIES table:

SELECT currency

FROM currencies AS c

WHERE EXISTS (SELECT *

              FROM countries

              WHERE currency_code = c.code);

 

Without the outer reference, the select becomes a conditional ‘all-or-nothing’ statement: perform the outer select if the subselect result is not empty, otherwise select nothing.

List all products where the producer (manufacturer) is not known:

SELECT product

FROM products AS p

WHERE EXISTS (SELECT *

              FROM items

              WHERE producer_id IS NULL

                AND product_id = p.product_id);

Examples of NOT EXISTS

The next example illustrates NOT EXISTS:

List all products where the producer (manufacturer) is not known:

SELECT product

FROM products

WHERE NOT EXISTS (SELECT *

                  FROM items

                  JOIN producers ON items.producer_id = producers.producer_id

                  WHERE product_id = products.product_id);

Result:

PRODUCT

Invictus

Middle Of Nowhere

Negated EXISTS

Negated EXISTS clauses must be handled with care. There are two semantic ‘opposites’ to EXISTS, with very different meanings:

WHERE EXISTS (SELECT *

              FROM artists

              WHERE artist = 'Enigma')

 

is true if at least one artist is called Enigma.

WHERE NOT EXISTS (SELECT *

                  FROM artists

                  WHERE artist = 'Enigma')

is true if no artist is called Enigma.

But

WHERE EXISTS (SELECT *

              FROM artists

              WHERE artist <> 'Enigma')

 

is true if at least one artist is not called Enigma.

WHERE NOT EXISTS (SELECT *

                  FROM artists

                  WHERE artist <> 'Enigma')

 

is true if no artist is not called Enigma, that is if every artist is called Enigma.

Retrieval with ALL, ANY, SOME

Subselects that return a set of values may be used in the quantified predicates ALL, ANY or SOME. Thus

WHERE PRICE < ALL (subselect)

 

selects rows where the price is less than every value returned by the subselect

WHERE PRICE < ANY (subselect)

 

selects rows where the price is less than at least one of the values returned by the subselect

Select countries that have an exchange rate of less than one:

SELECT country

FROM countries

WHERE currency_code <> ALL (SELECT code

                            FROM currencies

                            WHERE exchange_rate >= 1.0);

 

If the result of the subselect is an empty set, ALL evaluates to true, while ANY or SOME evaluates to false.

An alternative to using ALL, ANY or SOME in a value comparison against a general sub-select, is to use EXISTS or NOT EXISTS to see if values are returned by a sub-select which only selects for specific values. For example:

Select countries where the associated currency code contains the letter 'E' as the middle character in the code:

SELECT country

FROM countries

WHERE currency_code = ANY (SELECT code

                           FROM currencies

                           WHERE code LIKE '_E_');

is equivalent to:

SELECT country

FROM countries AS c

WHERE EXISTS (SELECT *

              FROM currencies

              WHERE code LIKE '_E_'

                AND code = c.currency_code);

Union, Except and Intersect Queries

The UNION, EXCEPT and INTERSECT operators combine the results of two select clauses.

UNION first merges the result tables specified by the separate selects and then eliminates duplicate rows from the merged set. (UNION ALL does not eliminate duplicate rows.)

EXCEPT takes the distinct rows from the first select and returns the rows that do not appear in the second select. (EXCEPT ALL does not eliminate duplicate rows.)

INTERSECT takes the results of two selects and returns only rows that appear in both selects, after removing duplicate rows from the final result set. (INTERSECT ALL does not eliminate duplicate rows.)

Columns which are merged by UNION, EXCEPT and INTERSECT must have compatible data types (numerical with numerical, character with character, etc).

Subselects addressing more than one result column are merged column by column in the order of selection. The number of columns addressed in each subselect must be the same.

The column names in the result of a UNION, EXCEPT or INTERSECT are taken from the names in the first subselect. Use labels in the first subselect to assign different column names to the result table.

In UNION, EXCEPT and INTERSECT queries, you may need to add an empty column so that columns not represented in both queries in the statement are retained in the result set. This is done by casting a null value to a matching data type.

Example

SELECT ean_code, release_date, producer

FROM items

INNER JOIN producers

    ON items.producer_id = producers.producer_id

UNION ALL

SELECT ean_code, release_date, CAST(NULL AS char)

FROM items

WHERE NOT EXISTS

    (SELECT * FROM producers

     WHERE items.producer_id = producers.producer_id)

UNION Examples

Select the different codes for currencies and countries that start with the letter 'D':

SELECT code

FROM currencies

WHERE code LIKE 'D%'

UNION

SELECT currency_code

FROM countries

WHERE country LIKE 'D%';

 

The result is obtained by merging the results of the two selects and eliminating duplicates:

SELECT code                   SELECT currency_code

FROM currencies               FROM currencies

WHERE code LIKE 'D%;          WHERE country LIKE 'D%';

 

CODE

 

CURRENCY_CODE

DJF

 

DJF

DKK

 

DKK

DOP

 

XCD

DZD

 

DOP

and the UNION gives the result table:

CODE

DJF

DKK

DOP

DZD

XCD

To retain duplicates in the result table, use UNION ALL in place of UNION, see the Mimer SQL Reference Manual, UNION or UNION ALL, for details.

Merge the codes and names of currencies where the code begins with 'D' with the codes and names of the countries where the country begins with 'D':

SELECT code, currency AS currency_or_country

FROM currencies

WHERE code LIKE 'D%'

UNION

SELECT currency_code, country

FROM countries

WHERE country LIKE 'D%'

ORDER BY code;

Result:

CODE

CURRENCY_OR_COUNTRY

DJF

Djibouti

DJF

Djibouti Francs

DKK

Danish Kronor

DKK

Denmark

DOP

Dominican Pesos

DOP

Dominican Republic

DZD

Algerian Dinars

XCD

Dominica

Find the lowest and highest exchange_rates:

Unions can be used to combine information from the same table.

SELECT 'Highest', MAX(exchange_rate) AS rate

FROM currencies

UNION ALL

SELECT 'Lowest', MIN(exchange_rate)

FROM currencies

ORDER BY rate;

Result:

 

RATE

Lowest

0.2644

Highest

1035000.0000

EXCEPT Example

Select the codes from currencies, except those that also are found in countries, starting with the letter 'D':

SELECT code

FROM currencies

WHERE code LIKE 'D%'

EXCEPT

SELECT currency_code

FROM countries

WHERE country LIKE 'D%';

 

The result is obtained by taking the first result and then remove the rows also found in the second select, and finally eliminating duplicates:

SELECT code                   SELECT currency_code

FROM currencies               FROM currencies

WHERE code LIKE 'D%;          WHERE country LIKE 'D%';

 

CODE

 

CURRENCY_CODE

DJF

 

DJF

DKK

 

DKK

DOP

 

XCD

DZD

 

DOP

and the EXCEPT gives the result table:

CODE

DZD

To retain duplicates in the result table, use EXCEPT ALL in place of EXCEPT, see the Mimer SQL Reference Manual, EXCEPT or EXCEPT ALL, for details.

INTERSECT Example

Select the codes from currencies and countries that exist in both tables, starting with the letter 'D':

SELECT code

FROM currencies

WHERE code LIKE 'D%'

INTERSECT

SELECT currency_code

FROM countries

WHERE country LIKE 'D%';

 

The result is obtained by taking the rows that are included in the first result and also in the second select, and finally eliminating duplicates:

SELECT code                   SELECT currency_code

FROM currencies               FROM currencies

WHERE code LIKE 'D%;          WHERE country LIKE 'D%';

 

CODE

 

CURRENCY_CODE

DJF

 

DJF

DKK

 

DKK

DOP

 

XCD

DZD

 

DOP

and the INTERSECT gives the result table:

CODE

DFJ

DKK

DOP

To retain duplicates in the result table, use INTERSECT ALL in place of INTERSECT, see the Mimer SQL Reference Manual, INTERSECT or INTERSECT ALL, for details.

Handling Null Values

Null values require special handling in SQL queries. Null represents an unknown value, and strictly speaking null is never equal to null. (Null values are however treated as equal for the purposes of GROUP BY, DISTINCT and UNION, EXCEPT and INTERSECT).

Searching for null

The condition for selecting null values is

WHERE column IS NULL

 

The negated form (WHERE column IS NOT NULL) selects values which are not null (i.e. values which are known).

List all currencies, and their codes, where the exchange rate is not known:

SELECT currency, code

FROM currencies

WHERE exchange_rate IS NULL;

Result:

CURRENCY

CODE

Armenian Drams

AMD

Kwanza

AOA

Brunei Dollars

BND

Francs Congolais

CDF

Saint Helena Pounds

SHP

Somali Shillings

SOS

Somoni

TJS

List all EAN codes where the producer is not known:

SELECT ean_code

FROM items

WHERE producer_id IS NULL;

Result:

EAN_CODE

4988002364947

4988011353147

List all EAN codes issued to Llewellyn Publications, where the release date is not known:

SELECT ean_code

FROM items

WHERE release_date IS NULL

AND PRODUCER_ID IN (SELECT producer_id

                    FROM producers

                    WHERE producer = 'Llewellyn Publications');

Result:

EAN_CODE

9780875428697

9780875428949

9780875428260

9780875428680

9780875427386

Null values in ALL, ANY, IN and EXISTS Queries

Null values should be treated cautiously, particularly in ALL, ANY, IN and EXISTS queries.

The result of a comparison involving null is unknown, which is generally treated as false. This can lead to unexpected results.

For example, neither of the following conditions are true:

<null> IN (...,null,...)

<null> NOT IN (...,null,...)

 

The first result is almost intuitive: since null is not equal to null, null is not a member of a set containing null.

But if null is not a member of a set containing null, the second result is intuitively true.

In fact, neither result is true or false: both are unknown. If null values are involved on either side of the comparison, IN and NOT IN are not complementary. Similar arguments apply to queries containing ALL or ANY, for example:

SELECT currency, code

FROM currencies

WHERE exchange_rate > ALL (SELECT exchange_rate

                           FROM currencies

                           WHERE currency LIKE 'D%');

Result:

CURRENCY

CODE

Belarussian Rubles

BYR

Maticais

MZM

Lei

ROL

Turkish Liras

TRL

This query works as long as there are no null values returned by the subselect. But perform the subselect against a range of currencies that contain a null value in the exchange rate, and the query results in an empty set:

SELECT currency, code

FROM currencies

WHERE exchange_rate > ALL (SELECT exchange_rate

                           FROM currencies

                           WHERE currency LIKE 'A%');

 

Moreover, the reverse query, currencies that have a lower exchange rate, also results in an empty set. A justification for this is that as long as an exchange rate is unknown, it is impossible to say whether other currency rates are greater or less.

Using Exists

It is always possible to rephrase a query using ALL, ANY or IN in terms of one using EXISTS (with an outer reference between the selection and the EXISTS condition). This is to be recommended if the null indicator is to be permitted in the comparison sets, since null handling is then written out explicitly in the query.

Distinctions between queries involving null comparisons are subtle and are easily overlooked.

It is essential that the aim of a query is stringently defined before the query is formulated in SQL, and that the possible effects of null values in the search condition are considered.

There are many real-life examples where the presence of null has resulted in unforeseen and sometimes misleading data retrievals. It is advisable to define all columns in the database tables as NOT NULL except those where unknown values have a specific meaning. In this way the risks of confusion with null handling are minimized.

Conceptual Description of the Selection Process

This section presents a conceptual step-by-step analysis of the evaluation of a SELECT statement.

It is intended as an aid in formulating complex SELECT statements, and can also help you in understanding details of the statement syntax.

Note:The description here is purely conceptual. It does not represent the actual sequence of events performed by the database manager. In particular, the computer resource requirements implied by the intermediate result set defined in a FROM clause do not necessarily reflect actual requirements.

Query Used

The query used in the analysis is:

List those producers and the average price for the goods that they manufacture where they make more than 10 items. Sort the result by the average price, with the largest first:

SELECT producer, AVG(price) AS average

FROM producers AS p

JOIN items AS i

     ON p.producer_id = i.producer_id

GROUP BY p.producer

HAVING COUNT(*) > 10

ORDER BY average DESC, producer

Result:

PRODUCER

AVERAGE

BBC Audio (Spoken Word)

37.742727272727

MCA

27.798181818181

RCA

19.580000000000

Elektra/Asylum

18.265714285714

Warner Brothers

17.137894736842

Capitol

16.646666666666

Atlantic

14.798181818181

Sony

14.091111111111

Selection Process

Step 1Subselects at the lowest nesting level are evaluated first.

The first step in evaluating a select is to resolve subselects from the lowest level up, and conceptually replace the subselect with the result set. The example here does not use a nested select.

When all subselects are resolved, a, possibly complicated, single-level SELECT statement remains.

Step 2The FROM clause defines an intermediate result set.

Tables addressed in the FROM clause are combined to form an intermediate result set which is the full cross product of the tables.

The cross product is a table with one column for each column in each of the table, and one row for every combination of rows from the different tables.

The columns in the result set are identified by the qualified column names from the table from which they are derived.

FROM producers AS p JOIN items AS i

 

The FROM clause in the example produces an intermediate result set which is the full cross product of the PRODUCERS table and the ITEMS table.

Step 3The ON clause selects rows from the intermediate set.

The ON clause selects rows from the full cross product result set that meet the JOIN criteria specified.

ON p.producer_id = i.producer_id

 

In this example the ON clause selects only those result set rows where the value in the PRODUCER_ID column from the PRODUCERS table is equal to that in the PRODUCER_ID column from the ITEMS table.

The GROUP BY clause groups the remaining result set:

GROUP BY p.producer

 

PRODUCER

PRICE

404 Music Group

16.98

4AD Records

11.98

7-N Music

16.98

A&M Records

11.98

A&M Records

22.98

A&M Records

10.98

A&M Records

18.98

A&M Records

18.98

Step 4The HAVING clause selects groups:

HAVING COUNT(*) > 10

 

PRODUCER

PRICE

Atlantic

17.98

Atlantic

11.98

Atlantic

11.98

Atlantic

9.98

Atlantic

11.98

Atlantic

17.98

Atlantic

11.98

Atlantic

11.98

Step 5The SELECT list selects columns, evaluates any expressions in the SELECT list, and reduces groups to single rows if set functions are used:

SELECT producer, AVG(price) AS average

 

PRODUCER

AVERAGE

Atlantic

14.798181818181

BBC Audio (Spoken Word)

37.742727272727

Capitol

16.646666666666

Collins

7.529814814814

Elektra/Asylum

18.265714285714

Geffen Records

12.480000000000

HarperCollins

6.722187500000

Marshall Editions

9.842222222222

Step 6The results of subselects joined by UNION, EXCEPT and INTERSECT are merged.

This example does not include a UNION, EXCEPT or INTERSECT.

Step 7The final result is sorted according to the ORDER BY clause:

ORDER BY average DESC, producer;

 

PRODUCER

AVERAGE

BBC Audio (Spoken Word)

37.742727272727

MCA

27.798181818181

RCA

19.580000000000

Elektra/Asylum

18.265714285714

Warner Brothers

17.137894736842

Capitol

16.646666666666

Atlantic

14.798181818181

Sony

14.091111111111