This tutorial is aimed at those of you with some experience with SQL and who want to discover the capabilities offered by some of its more advanced implementations. One of the main differences between the highly restricted Jet SQL behind Microsoft Access, and the more powerful Transact SQL used in Microsoft SQL Server 7 and SQL Server 2000 is that, wherever you can use a table, you can usually also use a query, a view, a union, or any type of JOIN statement. This is in stark contrast to Jet SQL where the use of sub-queries is very limited, and where there are many constraints placed on the order in which JOIN statements are nested. As you proceed through the tutorial, you will see this point illustrated again and again in the examples, and you will also find a few other tips and tricks that might be useful when implementing an advanced database application.
Using Table And Field Aliases
Before proceeding
to detail some of the features of Transact SQL, we shall
quickly cover the use of aliases. As advanced SQL queries can
become quite complex, and the names of fields and tables quite
lengthy, it is often useful to be able to use aliases. Field
aliases can be assigned using the following three methods:
alias =
FieldName
FieldName AS
alias
FieldName
alias.
Table aliases, however, can only be assigned in
two ways:
Table
AS alias
Table
alias.
For Example:
SELECT FName = Del.FirstName, Del.LastName AS Surname,
DelReg.RegistrationTypeID
RegTypeID
FROM
Delegate AS Del INNER JOIN DelegateRegistration DelReg
ON Del.DelegateID = DelReg.DelegateID
Simple UNIONS
The results of two queries on
separate tables can be combined into a single, temporary table
using a simple UNION statement. Both queries must return the
same number and type of columns, and field aliases can only be
applied to the first - any applied to the second query are
ignored. By default, a UNION will remove duplicates, but this
behavior can be overcome by using UNION ALL. Only a single
ORDER BY clause can be given at the end of the query.
SELECT FirstName FName, LastName
LName
FROM
Delegate
UNION
SELECT FirstName
MyFirstName, LastName
MyLastName
FROM
NonDelegate
ORDER BY
FName, Lname
Joining Tables
A simple INNER JOIN returns a
temporary table consisting of all columns from the first table
followed by all columns from the second for those rows that
are "matched" by the criteria of the ON clause. (These
criteria are not restricted to just simple tests for equality;
they can be full Boolean expressions comparing multiple values
from both tables.) In the following example we select all the
fields from the Delegate and DelegateRegistration tables for
every delegate who has a record in both.
SELECT *
FROM Delegate
Del INNER JOIN DelegateRegistration DelReg
ON Del.DelegateID = DelReg.DelegateID
By switching to a LEFT JOIN, all rows from table 1 will appear in the result set, even if there is no "matching" row in table 2 - a NULL value being assigned to the relevant columns of those rows that were not matched. Similarly, a RIGHT JOIN forces all rows in table 2 to show up in the results, even if there is no matching record in table 1.
Finding The Difference Between Two
Table-Sources
Using SQL's capability of joining two
tables as described in the above section we are able to find
records that appear in one table but are absent from a second,
related one. The standard technique for doing this is to use a
LEFT JOIN, and filter out those records that have no match.
SELECT Delegate.FirstName,
Delegate.LastName
FROM
Delegate LEFT JOIN NonDelegate
ON (Delegate.FirstName = NonDelegate.FirstName
AND Delegate.LastName =
NonDelegate.LastName)
WHERE NonDelegate.FirstName IS NULL
Since you can use a table-expression in exactly the same way as a table, you can also use a JOIN statement to find the difference between a table and a query. The key thing to realize is that tables, JOIN statements and queries are all table-expressions, and as such can be treated like a table in every respect. As the following example illustrates, a query can even be assigned a table-alias.
SELECT Delegate.DelegateID, Delegate.FirstName,
Delegate.LastName
FROM Delegate LEFT JOIN
(SELECT DelegateID FROM
DelegateRegistration
WHERE RegistrationTypeID
= 2) Type2Dels
ON Delegate.DelegateID =
Type2Dels.DelegateID
WHERE Type2Dels.DelegateID IS NULL
Using Sub-Queries
Sub-queries can sometimes be
used as an alternative to a JOIN statement to make the query
more readable. The syntax for a sub-query is:
field operator
[ANY|ALL] (query)
Here we retrieve all delegates that
registered before May 14. The sub-query returns the
DelegateID's from the DelegateRegistration table for all those
delegates who registered before that date. Then this result
set is used as the criteria for selecting delegate information
from the Delegate table.
SELECT * FROM Delegate
WHERE DelegateID in
(SELECT DelegateID FROM DelegateRegistration
WHERE RegDate <
'05/14/2000')
UPDATE statements
The UPDATE statement can include a FROM
clause allowing you to retrieve fields referenced by the WHERE
clause of the SET clause. This is useful if you wish to update
records in one table based on the values of an associated
record in a second. This is illustrated in the following
example:
UPDATE Del
SET
Del.Status =
RegType.DisplayName
FROM (Delegate Del INNER JOIN DelegateRegistration DelReg
ON Del.DelegateID =
DelReg.DelegateID)
INNER JOIN RegistrationType RegType
ON
DelReg.RegistrationTypeID = RegType.RegistrationTypeID
WHERE
RegType.RegistrationTypeID = 2
A table row can only be updated with one record so, if the FROM clause returns more than one, then the first record to match the criteria is used. Ideally, the FROM clause should only return one record for each row.
DELETE Statements
As with the UPDATE
statement, the delete statement can also have a FROM clause to
delete a record based on the value of a related record in a
second table. Here we delete all unregistered delegates.
DELETE Delegate
FROM
Delegate LEFT JOIN DelegateRegistration
ON Delegate.DelegateID =
DelegateRegistration.DelegateID
WHERE RegistrationTypeID IS NULL
Using The GROUP BY Clause
The GROUP BY clause is
used to merge collections or groups of records into one. When
a group of records is collapsed in this way, we can specify
aggregate functions to apply to the fields of the individual
records to give us summary values for the group. Only the
summary values will be returned by the query - all of the data
specific to the individual rows is lost. Each field in the
SELECT list of a GROUP BY query must either appear in the
GROUP BY clause, or be used as the argument to an aggregate
function. The available aggregate functions are AVG, COUNT,
MAX, MIN, SUM, STDDEV, STDDEVP, VAR, VARP. The following
example will return a temporary table with a row for each
company containing its name and the number of its delegates
SELECT Company.CompanyName, Count(Delegate.DelegateID) AS
Number
FROM Company
INNER JOIN Delegate
ON Company.CompanyID =
Delegate.CompanyID
GROUP BY Company.CompanyName
Using The HAVING Clause
Just as a WHERE clause
can be used to filter the individual rows that are returned by
a standard query, so the HAVING clause can be used to filter
the group records that are returned by a GROUP BY query. The
following query returns all companies with more than one
delegate.
SELECT Company.CompanyName,
Count(Delegate.DelegateID)
FROM Company INNER JOIN Delegate
ON Company.CompanyID =
Delegate.CompanyID
GROUP BY
Company.CompanyName
HAVING COUNT (Delegate.DelegateID) > 1
The HAVING clause can also be a full Boolean expression, but should utilize aggregate functions when referring to fields not in the GROUP BY clause.
Finding Duplicates
If we group records together
by certain identifying fields we can then use a Count function
to extract those that are duplicated. This query utilizes a
sub-query and, based on their first and last names, returns a
list of DelegateIDs for those delegates that have duplicate
records.
SELECT Delegate.DelegateID, Delegate.FirstName,
Delegate.LastName
FROM Delegate INNER JOIN
(SELECT FirstName, LastName
FROM Delegate
GROUP BY FirstName,
LastName
HAVING Count(DelegateID)
> 1
) AS Duplicates
ON Delegate.FirstName = Duplicates.FirstName
AND Delegate.LastName = Duplicates.LastName
Removing Duplicates
The following query removes
all duplicates from the delegate table, leaving only the
originals of the duplicated record. Here, we define a record
as having a duplicate if there is another record with matching
FirstName and LastName fields. This query relies on a view
being set up that returns a list of duplicate delegates. The
view is called DupDels.
DELETE FROM Delegate
WHERE DelegateID =
ANY
( SELECT DupDels.DelegateID
FROM DupDels LEFT JOIN
(SELECT Min(DelegateID)
AS DelegateID, FirstName, LastName
FROM DupDels GROUP BY FirstName, LastName) AS FirstDup
ON DupDels.DelegateID =
FirstDup.DelegateID
WHERE FirstDup.DelegateID IS NULL
)
Tricks With The IN Operator
The IN operator can
be very useful when you wish to select records with field
values that are equal to any one of several possible literal
values. Rather than having a series of comparisons linked
together by OR operators, a single IN clause can be used to
compare a field's value against a list of possibilities as
below:
SELECT * FROM Delegate
WHERE LastName IN ('Smith','Jones')
A nice trick when building such a query string from an HTML multiple-select list, is to use the VBScript or JavaScript Join function to merge all the selected values in the array into a comma-separated string. This string can then be used directly as the argument to an operator in the database query.
Inserting Rows From A Table-Source
In addition to
insert statements for inserting a single record that take the
following form:
INSERT INTO Table (field_list) VALUES (value_list)
We
can also insert the result returned by running a query as
below: (This also illustrates the use of literals in the
SELECT clause.)
INSERT INTO Delegate (FirstName, LastName, Status, CompanyID)
SELECT 'George',
LastName, 'George',
CompanyID
FROM
George
A Note About Indexing
For queries to run
efficiently it is important that appropriate indexes be in
place. A simple guideline is that all fields referenced in a
comparison in a WHERE clause or in the ON clause of a JOIN
statement should be indexed. Bear in mind that all primary key
fields are automatically indexed, but that foreign keys must
have an index added manually.
Well, there you have it. Now that you have witnessed some of the awesome power of SQL, with its greater flexibility and increased capability, it is easy to see why it is the query language of choice for high-end database servers.