INTRODUCTION TO SQL

By: Helen M. Overland, 2004

What you will learn from this guide

In this guide, you will learn how to set up a test database, put data in, and get data out. You will also learn how to do basic updating and maintaining of your database. This guide is intended for those who have no prior experience with databases, but who have some familiarity with command-line interfaces (for example, MS-DOS) and Windows.

We will be using MySQL, which is an open source database management system. MySQL is commonly found in web applications. If you are planning to create or work with website applications (a guestbook or E-Commerce site for example), you will almost certainly come across MySQL at some time.

What you will need

For the purposes of this guide, you will need:
  1. A computer running Windows
  2. MySQL http://dev.mysql.com/downloads/mysql/4.1.html (download the "Windows Essentials (x86)" file)

What is SQL?

SQL (Structured Query Language) is a language designed to interact with an RDMS (Relational Database Management System). The RDMS is the program that actually controls the database (for example, MySQL, PostgreSQL, or Oracle). SQL is a declarative language, which means that it simply passes along what is wanted and lets the RDMS figure out how to fulfil the request. In general, SQL is intended to work across different RDMS platforms. However, there are some local differences. It is usually pronounced "es-Q-ell", or sometimes "sequel".


Setting up MySQL 4.1 on Windows 98/2000/XP

Installation
  1. Download MySQL. Save the file on the desktop
  2. Windows 2000/XP: Make sure you are logged in as an Administrator
  3. Double click the file to install
  4. Choose a "typical" installation
  5. When the installation is complete, click "Next >" to configure the server
  6. Choose all the defaults

    Setup
  7. Open up a command window:

    Start > Programs > MS-DOS Prompt
    Or
    Start > Run > "command"

  8. Go to the MySQL binary directory:

    cd c:\progra~1\mysql\mysqls~1.1\bin

  9. If it is not started, start MySQL

    Windows 98:

    mysqld

    Windows 2000/XP:

    net start MySQL

Update the root password:

Note: You will understand this better later. For now, just copy and paste the commands to change your password (changing "YOUR_NEW_PASSWORD" to your actual password).

mysql -u root mysql
update user
set password=password('YOUR_NEW_PASSWORD')
where user='root';
flush privileges;

Important: Don't forget to include password('') when setting a new password. Otherwise, you could lose all access to your database(s)!

exit


Great! MySQL is now set up. You don't need to reboot, but go ahead if it makes you feel better.


Working with MySQL:

To begin working with MySQL, enter the following command:

mysql -u root -p

What this tells MySQL: run MySQL, as user (-u) root, using a password (-p).

Enter your password at the prompt. You should now see something like this:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4 to server version: 4.1.11

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>

MySQL is now waiting for your commands.

All (our) MySQL commands end with a semicolon (;). If you forget to put the semicolon at the end of a command, MySQL will show you a prompt that looks like an arrow (->). You can enter the semicolon on this line. If you forget to close a quotation mark ('), the prompt will remind you by showing you that you need to close the quotation ('>). Enter the quotation and a semicolon. You will probably have to enter your command again without errors.

MySQL is not case sensitive, however, for the purposes of clearer instruction, commands are in uppercase in this guide.


Creating your database

Since MySQL comes with a database called 'test', we will create one called "mydb". Creating a database is easily done with the following command:

create database mydb;

Now that your database is created, we need to tell MySQL that we are going to work specifically with the mydb database. We do this by entering

use mydb;

Creating a table

Now, let's create our first table.
We will start by creating a SalesRep table. In this table, we will store some short information about our company sales reps.

CREATE TABLE salesrep(
	repnum int unsigned primary key auto_increment,
	fname varchar(30),
	lname varchar(50));

What this tells MySQL: Create a table called "salesrep". This table will have a primary key (a unique identifier for this record) called "repnum", which will be a number (specifically, an unsigned integer). Column "repnum" will automatically always insert new records with a value of 1 greater than the record inserted before (auto_increment). We also have a column for the rep's first name, which allows for 30 characters, and a column for their last name (50 characters).

Let's also create 2 more tables. You can copy and paste these commands if you like:

CREATE TABLE customer(
	custnum int unsigned primary key auto_increment,
	repnum int unsigned,
	fname varchar(30),
	lname varchar(50),
	state char(2));
CREATE TABLE sales(
	invnum int unsigned primary key auto_increment,
	custnum int unsigned,
	total decimal(7, 2),
	saledate timestamp default now());

Simple Insert statements

We are now ready to put data into our tables. First, let's add some sales reps.

INSERT INTO salesrep(fname, lname) VALUES ('Bob', 'Jones');

What this tells MySQL: create a new record in the table "salesrep", using the default "repnum" key. Set the first name to "Bob", and the last name to "Jones".

About Quotation Marks
The single quotation marks are important. Always enclose text entries in single quotes. You don't have to enclose numbers in quotes. If you need to insert a quote into the column, you can escape it with another quote.

For example
This query will cause an error (* Do not run this query at this time *):

insert into salesrep (fname, lname) values ('Bob', 'O'Grady');

But this query is fine (* Do not run this query at this time *):

insert into salesrep (fname, lname) values ('Bob', 'O''Grady');

Whenever you successfully complete a command in MySQL, you should see something that resembles the following:

Query OK, 1 row affected (0.11 sec)

Now, let's add a few more sales reps:

INSERT INTO salesrep(fname, lname)
	VALUES ('Jane', 'Smith');
INSERT INTO salesrep(fname, lname)
	VALUES ('Harry', 'Brown');
INSERT INTO salesrep(fname, lname)
	VALUES ('Carl', 'Brown');

Add some customers:

INSERT INTO customer(repnum, fname, lname, state)
	VALUES(1, 'Jill', 'Jackson', 'ny');
INSERT INTO customer(repnum, fname, lname, state)
	VALUES(2, 'Andrew', 'Fielding', 'mi');
INSERT INTO customer(repnum, fname, lname, state)
	VALUES(2, 'Harold', 'Stanton', 'ca');

And finally, a few sales:
insert into sales (custnum, total)
	values (1, 101.29);
insert into sales(custnum, total)
	values (2, 96.34);
insert into sales(custnum, total)
	values (1, 206.32);

Note: Because we defined the "saledate" column as a "timestamp" with a default of "now()", it will automatically insert the current date and time. As of MySQL 4.1, only ONE column in each table can be defined this way.

Simple Select Statements

Okay, we have data inserted. But what does it all mean? Let's take a look at the data to try to get a better picture of what is going on.

SELECT * FROM salesrep;

What this tells MySQL: show me everything (*) that is in the table salesrep;

You should get results that look like this:

mysql> SELECT * FROM salesrep;
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      1 | Bob   | Jones |
|      2 | Jane  | Smith |
|      3 | Harry | Brown |
|      4 | Carl  | Brown |
+--------+-------+-------+
4 rows in set (0.16 sec)

You can also select only specific columns, such as

mysql> SELECT fname, lname FROM salesrep;
+-------+-------+
| fname | lname |
+-------+-------+
| Bob   | Jones |
| Jane  | Smith |
| Harry | Brown |
| Carl  | Brown |
+-------+-------+
4 rows in set (0.04 sec)

Try selecting things from the other two tables yourself.


The WHERE Clause

Now, let's learn about the WHERE clause. This allows you to refine your search. For example:

mysql> SELECT * FROM salesrep WHERE lname = 'Brown';
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      3 | Harry | Brown |
|      4 | Carl  | Brown |
+--------+-------+-------+
2 rows in set (0.16 sec)

We can also make use of the Not Equals operator. This is normally either >< or !=.

mysql> SELECT * FROM salesrep WHERE lname != 'brown';
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      1 | Bob   | Jones |
|      2 | Jane  | Smith |
+--------+-------+-------+
2 rows in set (0.17 sec)


The AND & OR Operators

We can also use the AND & OR operators to further refine our searches.

Let's say we want to find out if Harry Brown has any relatives working at our company. We want the sales reps whose last name is Brown, but whose first name is not Harry. We can find out this way:

mysql> SELECT * FROM salesrep
	WHERE lname = 'brown'
		AND fname != 'harry';
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      4 | Carl  | Brown |
+--------+-------+-------+
1 row in set (0.28 sec)

Now, we've heard about these two exceptional sales reps called Bob and Jane. Let's look up both of them.

mysql> SELECT * FROM salesrep
	WHERE fname = 'bob'
		OR fname = 'jane';
+--------+-------+---------+
| repnum | fname | lname   |
+--------+-------+---------+
|      1 | Bob   | Jones   |
|      2 | Jane  | Smith |
+--------+-------+---------+
2 rows in set (0.00 sec)

You can also use the standard mathematical order of operations to refine your search.

SELECT * FROM salesrep
	WHERE (fname = 'Bob' OR fname = 'Jane')
		AND lname = 'Jones';

What this tells MySQL: select everything from salesrep where

1. the first name is "Bob" or "Jane"
AND
2. The last name is "Jones"

As you may guess, here is the result:

mysql> select * from salesrep
	where (fname = 'bob' or fname = 'jane')
		and lname = 'jones';
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      1 | Bob   | Jones |
+--------+-------+-------+
1 row in set (0.00 sec)

LIKE

Sometime you just don't know exactly what it is you are searching for. Let's say management wants to look up that sales rep "whose name starts with a 'J'". We can do this:

mysql> SELECT * FROM salesrep
	WHERE lname LIKE 'J%';
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      1 | Bob   | Jones |
+--------+-------+-------+
1 row in set (0.00 sec)

The '%' character is a wildcard. Here is how it works:

J% Find only matches that begin with "J"
%J Find only matches that end with "J"
%J% Find only matches that contain a "J" anywhere in the field
J Find matches where there is nothing but "J" in the field

Note: This feature can start to slow down your server if you are searching longer fields.


ORDER BY

Sometimes you may want to put your results in order. Doing so is easy:

mysql> SELECT * FROM salesrep
	ORDER BY lname;
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      3 | Harry | Brown |
|      4 | Carl  | Brown |
|      1 | Bob   | Jones |
|      2 | Jane  | Smith |
+--------+-------+-------+
4 rows in set (0.05 sec)

You can also order by ascending (ASC) or descending (DESC). Ascending is the default order, and you generally don't need to specify it. You can also order by more than one column. MySQL will order the results based on the first column you specify, and then the second, and so on.

mysql> select * from salesrep
	order by lname desc, fname;
+--------+-------+-------+
| repnum | fname | lname |
+--------+-------+-------+
|      2 | Jane  | Smith |
|      1 | Bob   | Jones |
|      4 | Carl  | Brown |
|      3 | Harry | Brown |
+--------+-------+-------+
4 rows in set (0.00 sec)

As you can see, the results are ordered first by the last name in descending order, and then by the first name in ascending order.

DISTINCT

Perhaps we want a list of all the last names of all our sales reps. Obviously, we don't want to see all the duplicates there might be.

SELECT DISTINCT lname FROM salesrep;

What this tells MySQL: show all the last names of all the sales reps only once.

This gives us:

mysql> select distinct lname from salesrep;
+-------+
| lname |
+-------+
| Jones |
| Smith |
| Brown |
+-------+
3 rows in set (0.06 sec)

Note: Each result has to be completely unique to be filtered out by DISTINCT. If you select more than one column, you will get all records that are not completely distinct from the other records. For example, if you also want to know the rep number of the last name you retrieve, you will get the following:

mysql> select distinct repnum, lname from salesrep;
+--------+-------+
| repnum | lname |
+--------+-------+
|      1 | Jones |
|      2 | Smith |
|      3 | Brown |
|      4 | Brown |
+--------+-------+
4 rows in set (0.05 sec)

Here, we are shown the last name of "Brown" twice because the repnum is different for each record, and therefore, the two "Brown" records are not DISTINCT. How do we solve this problem? With GROUP BY.

GROUP BY

GROUP BY will group each record by the field you specify. Here is how you would filter out that second "Brown" in the above query:

SELECT DISTINCT repnum, lname
	FROM salesrep
	GROUP BY lname;

What this tells MySQL: select each distinct repnum and last name, and group the results by the last name.

Your results are:

mysql> select distinct repnum, lname
	from salesrep
	group by lname;
+--------+-------+
| repnum | lname |
+--------+-------+
|      3 | Brown |
|      1 | Jones |
|      2 | Smith |
+--------+-------+
3 rows in set (0.00 sec)

You may notice that the results are ordered by the last name. GROUP BY automatically orders the results based on the specified group.

AS

You can choose what the columns will be called in your query results.

mysql> SELECT fname AS first_name FROM salesrep;
+------------+
| first_name |
+------------+
| Bob        |
| Jane       |
| Harry      |
| Carl       |
+------------+
4 rows in set (0.00 sec)


Aggregate Functions

There are a few functions built in to SQL to make your life easier. When you are building any application that will be used in high volume, you want to limit the number of times your program contacts the database and requests information. SQL is generally light and fast, and so you want it to do as much of the work as possible. Your program will not work as hard, and your users will not have to wait as long if you use SQL to do some of your processing for you. Here is an explaination of some of the more common functions.
COUNT()
If you just want to count how many records are in a table, use this function. It is much faster than requesting every record, and then looping through them to count how many came back (depending on your application language).

SELECT COUNT(*) FROM salesrep;

What this tells MySQL: count the number of rows in the table "salesrep";

For easier integration with your program, give a name to the column in your result:

mysql> SELECT COUNT(*) AS totalreps FROM salesrep;
+-----------+
| totalreps |
+-----------+
|         4 |
+-----------+
1 row in set (0.00 sec)

SUM()
This function will add the values of the specified column. Let's say we want to find out what our total company sales are.

Here is what our table looks like:

mysql> select * from sales;
+--------+---------+--------+---------------------+
| invnum | custnum | total  | saledate            |
+--------+---------+--------+---------------------+
|      1 |       1 | 101.29 | 2005-05-01 15:31:42 |
|      2 |       2 |  96.34 | 2005-05-01 15:32:38 |
|      3 |       1 | 206.32 | 2005-05-01 15:32:57 |
+--------+---------+--------+---------------------+
3 rows in set (0.00 sec)

And here is our total company sales:

mysql> SELECT SUM(total) AS totalsales FROM sales;
+------------+
| totalsales |
+------------+
|     403.95 |
+------------+
1 row in set (0.00 sec)

MIN()
This function will return the minimum value in the specified column. Perhaps we want to find out which customer has placed the smallest order with us:

mysql> SELECT MIN(total) AS smallorder FROM sales;
+------------+
| smallorder |
+------------+
|      96.34 |
+------------+
1 row in set (0.00 sec)

MAX()
As you may guess, this function will return the maximum value in the specified column. In this case, we want to find out which customer has placed the largest order:

mysql> SELECT MAX(total) AS bigorder FROM sales;
+----------+
| bigorder |
+----------+
|   206.32 |
+----------+
1 row in set (0.00 sec)

AVG()
This function will return the average value in the specified column. In this case, we want to find out what our average order size is.

mysql> SELECT AVG(total) AS averageorder FROM sales;
+--------------+
| averageorder |
+--------------+
|   134.650000 |
+--------------+
1 row in set (0.05 sec)


Simple Update Statements

It turns out that Jane is getting married, and so her last name has to be updated. Here is how we update her name, assuming we know that her RepNum is '2'.

UPDATE salesrep SET lname = 'Swanson' WHERE repnum = 2;

What this tells MySQL: Update the salesrep table. Set EVERY last name to "Swanson" where the repnum is already "2".

Note: if you exclude the WHERE clause in this query, you will update every single Sales Rep's last name to "Swanson".

A less reliable way of updating Jane's last name is to update based on her current last name.

UPDATE salesrep SET lname = 'Swanson' WHERE lname = 'Smith';

This is not such a good way of updating, as every sales rep with a last name of "Smith" will be updated with "Swanson". Using this method, if we were to update Harry Brown's last name to "Black", we would also update Carl Brown's name to Carl Black.

It is always better to use the Primary Key (in this case, repnum) to identify a record where possible.

Simple Delete Statements

It seems that Carl Brown just isn't doing too well as a sales rep, and he has left the company. Now we have been told to delete his employee record.

DELETE FROM salesrep WHERE repnum = 4;

What this tells MySQL: delete EVERY record where the repnum is 4.

Important: If you omit the WHERE clause in this query, you will delete each and every record from this table. The table will be empty. If you want to empty a table, you simply enter:

(* Do not run this query at this time *)
DELETE FROM salesrep;

There is no undoing a delete query; the data is gone for good. If you delete something accidentally, you can either add it back, or restore your database from a backup.

The primary key, which is (in this table) set to auto increment for every new record, will not be set back for the deleted records. If there are 10 records in a table, and you delete record #10, the next record to be inserted will have a primary key of 11. The table will be missing record #10.

You can, however, reset the auto_increment for the next record back to 10.

(* Do not run this query at this time *)
ALTER TABLE salesrep auto_increment = 10;

In an actual production environment, data is rarely deleted. All data is considered to be important, and can provide a history of changes and activity.

Simple Alter Statements

Since some sales people are doing better than others, management has decided to introduce different sales commission rates for different sales reps. This means we will have to add a column to the salesrep table for the commission rate. Each sales rep will start with a default commission rate of 5%.

Add a Column
ALTER TABLE salesrep
	ADD rate tinyint unsigned default 5;

What this tells MySQL: add a column to the "salesrep" table called "rate" that will be of data type "tinyint unsigned". For each record, use a default value of "5" (enter "5" unless we specify a different number).

Let's now select everything from our salesrep table so we can see how things look now:

mysql> SELECT * FROM salesrep;
+--------+-------+---------+------+
| repnum | fname | lname   | rate |
+--------+-------+---------+------+
|      1 | Bob   | Jones   |    5 |
|      2 | Jane  | Swanson |    5 |
|      3 | Harry | Brown   |    5 |
+--------+-------+---------+------+
3 rows in set (0.11 sec)


Modify a column

Management forgot to tell us that some of the new commission rates could have decimal places. Also, because the sales staff has complained about the new commission rates, the default commission rate will now be 5.25%. We will have to update our column.

ALTER TABLE salesrep
	MODIFY COLUMN rate decimal(4,2) default 5.25;

What this tells MySQL: Make changes to table "salesrep". Specifically, make changes to the "rate" column. Make the "rate" column a decimal field, with 2 decimal places, and 4 numbers in total (including the decimal places). Make the default value "5.25".

Let's get a listing of what's in our table again:

mysql> select * from salesrep;
+--------+-------+---------+------+
| repnum | fname | lname   | rate |
+--------+-------+---------+------+
|      1 | Bob   | Jones   | 5.00 |
|      2 | Jane  | Swanson | 5.00 |
|      3 | Harry | Brown   | 5.00 |
+--------+-------+---------+------+
3 rows in set (0.00 sec)

There is a problem here: each rate is still at 5%. Because the columns already contained a value, they were not updated with the new value. We have to manually update the rate for all sales reps. As we've already learned, any update statement that has no WHERE clause will update each and every record in the table.

UPDATE salesrep SET rate = 5.25;

Let's see our results:

mysql> select * from salesrep;
+--------+-------+---------+------+
| repnum | fname | lname   | rate |
+--------+-------+---------+------+
|      1 | Bob   | Jones   | 5.25 |
|      2 | Jane  | Swanson | 5.25 |
|      3 | Harry | Brown   | 5.25 |
+--------+-------+---------+------+
3 rows in set (0.00 sec)

Each new record that is inserted into the database will now have a rate of "5.25": Not because we updated the records with 5.25, but because we declared a default value of 5.25 if no other value is present.

Drop a column

Sometimes you may find you want to drop a column. Let's add a test column, and delete it again for the sake of demonstration.

Let's say management has decided to record the age of each sales rep:

ALTER TABLE salesrep
	ADD COLUMN age tinyint unsigned;

What this tells MySQL: Make changes to the table "salesrep". Specifically, add a column called "age" that allows for whole numbers.

What happened to our table:

mysql> select * from salesrep;
+--------+-------+---------+------+------+
| repnum | fname | lname   | rate | age  |
+--------+-------+---------+------+------+
|      1 | Bob   | Jones   | 5.25 | NULL |
|      2 | Jane  | Swanson | 5.25 | NULL |
|      3 | Harry | Brown   | 5.25 | NULL |
+--------+-------+---------+------+------+
3 rows in set (0.00 sec)

No, wait. Management has decided that this is not such a good idea after all. We'd better drop the table.

ALTER TABLE salesrep
	DROP COLUMN age;

What this tells MySQL: make changes to the table "salesrep". Specifically, drop the column called "age".
Our results:

mysql> select * from salesrep;
+--------+-------+---------+------+
| repnum | fname | lname   | rate |
+--------+-------+---------+------+
|      1 | Bob   | Jones   | 5.25 |
|      2 | Jane  | Swanson | 5.25 |
|      3 | Harry | Brown   | 5.25 |
+--------+-------+---------+------+
3 rows in set (0.00 sec)

When you drop a column, the change is permanent. If you drop a column accidentally, you can either add it back, or restore your database from a backup.

Finding out about your database

Now that we have made so many changes to our salesrep table, we might be starting to forget exactly how the table looks. We can get a description of the table by entering the following:

mysql> DESCRIBE salesrep;
+--------+------------------+------+-----+---------+----------------+
| Field  | Type             | Null | Key | Default | Extra          |
+--------+------------------+------+-----+---------+----------------+
| repnum | int(10) unsigned |      | PRI | NULL    | auto_increment |
| fname  | varchar(30)      | YES  |     | NULL    |                |
| lname  | varchar(50)      | YES  |     | NULL    |                |
| rate   | decimal(4,2)     | YES  |     | 5.25    |                |
+--------+------------------+------+-----+---------+----------------+
4 rows in set (0.05 sec)

A shorter way of saying the same thing is

DESC salesrep;

Perhaps we can't remember what a specific table is called. We can get a listing of all the tables in this database:

mysql> SHOW tables;
+----------------+
| Tables_in_mydb |
+----------------+
| customer       |
| sales          |
| salesrep       |
+----------------+
3 rows in set (0.00 sec)

And of course, we can see all the databases available:

mysql> SHOW databases;
+----------+
| Database |
+----------+
| mydb     |
| mysql    |
| test     |
+----------+
3 rows in set (0.06 sec)

The mysql database contains configuration and permission information. If you remember at the beginning we changed the root password. If you look back at that query now, it should make sense to you:

update user set password=password('YOUR_NEW_PASSWORD') where user='root';
flush privileges;

Important!You must put your new password inside the password('') function when setting a new passsword. Why? The password() function encrypts your password in a one-way (unencryptable) format. When you log in again, the log in automatically encrypts what you type in at the password prompt with the same function. What you type in will never match what's in the database if they are both not encrypted the same way.


What is a Join?

One of the more useful reasons for having a relational database is that we can relate data to other data based on something they have in common. For example, we can see all the customers who buy from Jane, or how much commission we are due Bob based on his customers sales.

In a very basic sense, a join says, "show me the data from this table and from that table".

Types of Joins
The most basic and common type of join is the INNER JOIN. This type of join says "show me all the data in this table and in that table that match each other".

For example:
    "Show me all the customers who have Jane as their Sales Rep".
    "Show me how much commission we are due Bob"
    "Show me all the customers that have placed an order"
Another type of join is the OUTER JOIN. This type of join says "show me all the data in this table, and also show me only the matching data in that table".

For example:
    "Show me all the customers, and show me how many times they have ordered"
    "Show me all the sales reps, and show me their total sales"
Another type of join is the CARTESIAN JOIN or the CROSS JOIN. This type of join says "show me all of the data in this table, and for each record, show me all the data in that table". A Cartesian Join is almost always an ERROR. If the tables involved have a few thousand records each, you can easily crash the server. Keep reading for more information about this.


Simple Joins

You can join two tables together this way:

SELECT * FROM salesrep, customer
WHERE customer.repnum = salesrep.repnum;

What this tells MySQL: get all the records from each table, making sure we only get records that match each other, based on the rep number. Here is our result:

+--------+-------+---------+------+---------+--------+--------+----------+-------+
| repnum | fname | lname   | rate | custnum | repnum | fname  | lname    | state |
+--------+-------+---------+------+---------+--------+--------+----------+-------+
|      1 | Bob   | Jones   | 5.25 |       1 |      1 | Jill   | Jackson  | ny    |
|      2 | Jane  | Swanson | 5.25 |       2 |      2 | Andrew | Fielding | mi    |
|      2 | Jane  | Swanson | 5.25 |       3 |      2 | Harold | Stanton  | ca    |
+--------+-------+---------+------+---------+--------+--------+----------+-------+
3 rows in set (0.00 sec)

As you can see, this is somewhat confusing because we don't know which information belongs to the customer, and which information belongs to the salesrep. Let's modify this query a little for clarity:

SELECT custnum, customer.fname AS cust_fname, customer.lname AS cust_lname, state,
salesrep.fname AS rep_fname, salesrep.lname AS rep_lname, salesrep.repnum, rate
FROM salesrep, customer
WHERE customer.repnum = salesrep.repnum;

This data is now a little clearer:

+---------+------------+------------+-------+-----------+-----------+--------+------+
| custnum | cust_fname | cust_lname | state | rep_fname | rep_lname | repnum | rate |
+---------+------------+------------+-------+-----------+-----------+--------+------+
|       1 | Jill       | Jackson    | ny    | Bob       | Jones     |      1 | 5.25 |
|       2 | Andrew     | Fielding   | mi    | Jane      | Swanson   |      2 | 5.25 |
|       3 | Harold     | Stanton    | ca    | Jane      | Swanson   |      2 | 5.25 |
+---------+------------+------------+-------+-----------+-----------+--------+------+
3 rows in set (0.00 sec)

Now, let's talk about a common error that occurs: The CARTESIAN or CROSS JOIN. If we forget to add our WHERE clause to this join, we will get the following result:

SELECT custnum, customer.fname AS cust_fname, customer.lname AS cust_lname, state,
salesrep.fname AS rep_fname, salesrep.lname AS rep_lname, salesrep.repnum, rate
FROM salesrep, customer;

+---------+------------+------------+-------+-----------+-----------+--------+------+
| custnum | cust_fname | cust_lname | state | rep_fname | rep_lname | repnum | rate |
+---------+------------+------------+-------+-----------+-----------+--------+------+
|       1 | Jill       | Jackson    | ny    | Bob       | Jones     |      1 | 5.25 |
|       1 | Jill       | Jackson    | ny    | Jane      | Swanson   |      2 | 5.25 |
|       1 | Jill       | Jackson    | ny    | Harry     | Brown     |      3 | 5.25 |
|       2 | Andrew     | Fielding   | mi    | Bob       | Jones     |      1 | 5.25 |
|       2 | Andrew     | Fielding   | mi    | Jane      | Swanson   |      2 | 5.25 |
|       2 | Andrew     | Fielding   | mi    | Harry     | Brown     |      3 | 5.25 |
|       3 | Harold     | Stanton    | ca    | Bob       | Jones     |      1 | 5.25 |
|       3 | Harold     | Stanton    | ca    | Jane      | Swanson   |      2 | 5.25 |
|       3 | Harold     | Stanton    | ca    | Harry     | Brown     |      3 | 5.25 |
+---------+------------+------------+-------+-----------+-----------+--------+------+
9 rows in set (0.00 sec)

What happened? Basically, SQL said "give me everything you have". MySQL happily returned the first record of the first table, then every record from the second table. Then it returned the second record from the first table, and every record (again) from the second table. And so on.

This kind of join can wreck havok on your server. We retrieved 9 records from two tables of 3 records each. If you have 10,000 records in each table (which is not a lot, really), you will receive 100,000,000 records. Imagine if you had a few good sized tables of a few hundred thousand records each.

If you do this type of join, you should grin sheepishly and reboot your computer. Always test your work on a development server before running it on the live server.

Outer Joins

Sometimes, you want all of the records in the first table, but only the matching records in another table. Let's say you want to see all of the customers, and any sales they may have placed.

mysql> SELECT fname, lname, total, saledate
FROM customer LEFT OUTER JOIN sales ON customer.custnum  = sales.custnum;
+--------+----------+--------+---------------------+
| fname  | lname    | total  | saledate            |
+--------+----------+--------+---------------------+
| Jill   | Jackson  | 101.29 | 2005-05-02 15:38:05 |
| Jill   | Jackson  | 206.32 | 2005-05-02 15:38:05 |
| Andrew | Fielding |  96.34 | 2005-05-02 15:38:05 |
| Harold | Stanton  |   NULL |                NULL |
+--------+----------+--------+---------------------+
4 rows in set (0.00 sec)

This tells us that Jill Jackson has placed two orders, Andrew Fielding has placed one order, and Harold Stanton has placed no orders. Because we performed an OUTER join, we retrieved all the customers, even if they were not also found in the sales table.

Subqueries

I will quickly cover the topic of subqueries. You can select records in one query based on the results of another query.

mysql> SELECT fname FROM salesrep WHERE repnum IN (SELECT repnum FROM customer);
+-------+
| fname |
+-------+
| Bob   |
| Jane  |
+-------+
2 rows in set (0.00 sec)

This query gives us the first name of each sales rep that has a customer in the "customer" table. Subqueries aren't used frequently, since most problems can be handled another way. But you may find yourself in a situation where it might come in handy.

NULL

What is NULL? NULL means "empty". NULL is different from an empty string (entered as two quotation marks together '', and shown in the database as an empty field). NULL means "the absence of all data".

Excercises

Here is our database:

mysql> select * from salesrep;
+--------+-------+---------+------+
| repnum | fname | lname   | rate |
+--------+-------+---------+------+
|      1 | Bob   | Jones   | 5.25 |
|      2 | Jane  | Swanson | 5.25 |
|      3 | Harry | Brown   | 5.25 |
+--------+-------+---------+------+
3 rows in set (0.01 sec)

mysql> select * from customer;
+---------+--------+--------+----------+-------+
| custnum | repnum | fname  | lname    | state |
+---------+--------+--------+----------+-------+
|       1 |      1 | Jill   | Jackson  | ny    |
|       2 |      2 | Andrew | Fielding | ny    |
|       3 |      2 | Harold | Stanton  | ca    |
+---------+--------+--------+----------+-------+
3 rows in set (0.03 sec)

mysql> select * from sales;
+--------+---------+--------+---------------------+
| invnum | custnum | total  | saledate            |
+--------+---------+--------+---------------------+
|      1 |       1 | 101.29 | 2005-05-02 15:38:05 |
|      2 |       2 |  96.34 | 2005-05-02 15:38:05 |
|      3 |       1 | 206.32 | 2005-05-02 15:38:05 |
+--------+---------+--------+---------------------+
3 rows in set (0.01 sec)

Based on our current database, try to figure out what these queries will return before you run them:

SELECT *
FROM customer
WHERE state = 'ny';

SELECT COUNT(*)
FROM sales;

SELECT fname, lname, total
FROM sales, customer
WHERE sales.custnum = customer.custnum
	ORDER BY total desc;

SELECT sum(total)
FROM sales, salesrep, customer
WHERE sales.custnum = customer.custnum
	AND customer.repnum = salesrep.repnum
	AND salesrep.fname = 'Bob';

SELECT total / rate as commission, salesrep.fname, salesrep.lname
FROM sales, salesrep, customer
WHERE sales.custnum = customer.custnum
	AND customer.repnum = salesrep.repnum
GROUP BY salesrep.repnum
ORDER BY commission desc;


Further Resources

The MySQL site has comprehensive documentation:
http://www.mysql.com/

Many thanks to Kevin Montgomery, Provisional R.G.D., for his feedback and assistance in proofreading this article. Kevin can be reached at his website http://www.kemosite.com/
Want to republish this article on your site?
This article may be republished on your personal, non-profit, or commercial website, blog or E-Zine free of charge provided there is an active link to www.MsSEM.com and that this copyright notice is included. The article must be publicly and freely available - without a charge for the content.

Copyright 2006 Helen M. Overland, All Rights Reserved
www.MsSEM.com
SEO Search:    Search the Web with an SEO Focus: