DB Development in SQL
by K. Yue
1. Introduction
2. System Catalog
Example:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sakila |
| test |
| world |
| yue_exp |
+--------------------+
7 rows in set (0.00 sec)
>
mysql database
Example:
Try to add a new user by inserting into the user table: http://dev.mysql.com/doc/refman/5.6/en/adding-users.html. There are problems.
use mysql;
Try:
INSERT INTO user (Host,User,Password,select_priv, insert_priv, update_priv,ssl_cipher,x509_issuer,x509_subject)
VALUES ('localhost', 'abcde', PASSWORD('IloveDBMS'), 'Y', 'Y', 'Y', 'NULL', 'NULL', 'NULL');
FLUSH PRIVILEGES;
information_schema database
Example:
mysql> SELECT table_name, table_type, row_format, table_rows, avg_row_length
-> FROM information_schema.tables
-> WHERE table_schema = 'information_schema'
-> ORDER BY table_name DESC;
+---------------------------------------+-------------+------------+------------+----------------+
| table_name | table_type | row_format | table_rows | avg_row_length |
+---------------------------------------+-------------+------------+------------+----------------+
| VIEWS | SYSTEM VIEW | Dynamic | NULL | 0 |
| USER_PRIVILEGES | SYSTEM VIEW | Fixed | NULL | 1986 |
| TRIGGERS | SYSTEM VIEW | Dynamic | NULL | 0 |
| TABLE_PRIVILEGES | SYSTEM VIEW | Fixed | NULL | 2372 |
| TABLE_CONSTRAINTS | SYSTEM VIEW | Fixed | NULL | 2504 |
| TABLESPACES | SYSTEM VIEW | Fixed | NULL | 6951 |
| TABLES | SYSTEM VIEW | Fixed | NULL | 9450 |
| STATISTICS | SYSTEM VIEW | Fixed | NULL | 5753 |
| SESSION_VARIABLES | SYSTEM VIEW | Fixed | NULL | 3268 |
| SESSION_STATUS | SYSTEM VIEW | Fixed | NULL | 3268 |
| SCHEMA_PRIVILEGES | SYSTEM VIEW | Fixed | NULL | 2179 |
| SCHEMATA | SYSTEM VIEW | Fixed | NULL | 3464 |
| ROUTINES | SYSTEM VIEW | Dynamic | NULL | 0 |
| REFERENTIAL_CONSTRAINTS | SYSTEM VIEW | Fixed | NULL | 4814 |
| PROFILING | SYSTEM VIEW | Fixed | NULL | 308 |
| PROCESSLIST | SYSTEM VIEW | Dynamic | NULL | 0 |
| PLUGINS | SYSTEM VIEW | Dynamic | NULL | 0 |
| PARTITIONS | SYSTEM VIEW | Dynamic | NULL | 0 |
| PARAMETERS | SYSTEM VIEW | Dynamic | NULL | 0 |
| KEY_COLUMN_USAGE | SYSTEM VIEW | Fixed | NULL | 4637 |
| INNODB_TRX | SYSTEM VIEW | Fixed | NULL | 4534 |
| INNODB_LOCK_WAITS | SYSTEM VIEW | Fixed | NULL | 599 |
| INNODB_LOCKS | SYSTEM VIEW | Fixed | NULL | 31244 |
| INNODB_CMP_RESET | SYSTEM VIEW | Fixed | NULL | 25 |
| INNODB_CMPMEM_RESET | SYSTEM VIEW | Fixed | NULL | 29 |
| INNODB_CMPMEM | SYSTEM VIEW | Fixed | NULL | 29 |
| INNODB_CMP | SYSTEM VIEW | Fixed | NULL | 25 |
| GLOBAL_VARIABLES | SYSTEM VIEW | Fixed | NULL | 3268 |
| GLOBAL_STATUS | SYSTEM VIEW | Fixed | NULL | 3268 |
| FILES | SYSTEM VIEW | Fixed | NULL | 2677 |
| EVENTS | SYSTEM VIEW | Dynamic | NULL | 0 |
| ENGINES | SYSTEM VIEW | Fixed | NULL | 490 |
| COLUMN_PRIVILEGES | SYSTEM VIEW | Fixed | NULL | 2565 |
| COLUMNS | SYSTEM VIEW | Dynamic | NULL | 0 |
| COLLATION_CHARACTER_SET_APPLICABILITY | SYSTEM VIEW | Fixed | NULL | 195 |
| COLLATIONS | SYSTEM VIEW | Fixed | NULL | 231 |
| CHARACTER_SETS | SYSTEM VIEW | Fixed | NULL | 384 |
+---------------------------------------+-------------+------------+------------+----------------+
37 rows in set (0.08 sec)
Example:
SELECT t.TABLE_SCHEMA AS `schema`, COUNT(t.TABLE_NAME) AS num_tables
FROM information_schema.tables t
GROUP BY `schema`
ORDER BY num_tables DESC;
SELECT t.TABLE_SCHEMA AS `schema`, t.ENGINE, COUNT(t.TABLE_NAME) AS num_tables
FROM information_schema.tables t
GROUP BY `schema`, t.ENGINE
ORDER BY num_tables DESC;
performance_schema database
Sakila database
3. Views
CREATE
[OR REPLACE]
[ALGORITHM = {UNDEFINED | MERGE | TEMPTABLE}]
[DEFINER = { user | CURRENT_USER }]
[SQL SECURITY { DEFINER | INVOKER }]
VIEW view_name [(column_list)]
AS select_statement
[WITH [CASCADED | LOCAL] CHECK OPTION]
Example:
The Sakila database comes preloaded with MySQL installation. Refer to the documentation at by googling sakila. Download the database at review the SQL statements for declaring and populating the database. (Note that XAMPP does not include Sakila and you will need to install it yourself.)
Define a view to provide information of all paying customers: their id, the total sum of their payment, and the total sum of their payment towards film rentals.
DROP VIEW customer_payment;
CREATE DEFINER=CURRENT_USER SQL SECURITY INVOKER VIEW customer_payment
AS
SELECT
p.customer_id AS customer,
sum(p.amount) AS "Total Payment",
sum(IF(p.rental_id IS NULL, 0, p.amount)) AS "Film Rental Payment"
FROM payment as p
GROUP BY p.customer_id;
MariaDB [sakila]> SELECT * FROM customer_payment LIMIT 20;
+----------+---------------+---------------------+
| customer | Total Payment | File Rental Payment |
+----------+---------------+---------------------+
| 1 | 118.68 | 118.68 |
| 2 | 128.73 | 128.73 |
| 3 | 135.74 | 135.74 |
| 4 | 81.78 | 81.78 |
| 5 | 144.62 | 144.62 |
| 6 | 93.72 | 93.72 |
| 7 | 151.67 | 151.67 |
| 8 | 92.76 | 92.76 |
| 9 | 89.77 | 89.77 |
| 10 | 99.75 | 99.75 |
| 11 | 106.76 | 106.76 |
| 12 | 103.72 | 103.72 |
| 13 | 131.73 | 131.73 |
| 14 | 117.72 | 117.72 |
| 15 | 134.68 | 134.68 |
| 16 | 120.71 | 118.72 |
| 17 | 98.79 | 98.79 |
| 18 | 91.78 | 91.78 |
| 19 | 125.76 | 125.76 |
| 20 | 115.70 | 115.70 |
+----------+---------------+---------------------+
Note:
Example
Define the view category_film_count to show the category names, the numbers of inventoried films and the total number of copies of films in the categories.
For example:
mysql> SELECT * from category_film_count;
+-------------+--------------------------+-----------------------+
| category | Number of distinct films | Total copies of films |
+-------------+--------------------------+-----------------------+
| Action | 61 | 312 |
| Animation | 64 | 335 |
| Children | 58 | 269 |
| Classics | 54 | 270 |
| Comedy | 56 | 269 |
| Documentary | 63 | 294 |
| Drama | 61 | 300 |
| Family | 67 | 310 |
| Foreign | 67 | 300 |
| Games | 58 | 276 |
| Horror | 53 | 248 |
| Music | 51 | 232 |
| New | 60 | 275 |
| Sci-Fi | 59 | 312 |
| Sports | 73 | 344 |
| Travel | 53 | 235 |
+-------------+--------------------------+-----------------------+
16 rows in set (0.01 sec)
Suggested Solution:
DROP VIEW category_film_count;
CREATE VIEW category_film_count
AS
SELECT
cat.name AS category,
count(DISTINCT fc.film_id) AS "Number of distinct films",
count(i.inventory_id) AS "Total copies of films"
FROM category AS cat
LEFT JOIN film_category AS fc ON cat.category_id = fc.category_id
JOIN inventory AS i ON fc.film_id = i.film_id
GROUP BY category;
Note that there are restrictions in View. For examples, in MySQL 5.7:
4. Stored Procedures
Example:
DROP PROCEDURE AllSuppliers;
-- A very simple stored procedure
DELIMITER //
CREATE PROCEDURE AllSuppliers()
BEGIN
SELECT * FROM Supplier;
END //
DELIMITER ;
-- Call
CALL AllSuppliers;
Note:
Example:
DROP FUNCTION GetAllSuppliers;
-- A very simple stored function
DELIMITER //
CREATE FUNCTION GetAllSuppliers() RETURNS INT
BEGIN
DECLARE count INT DEFAULT 0;
SET count = 0; -- Not really needed.
SELECT COUNT(*) INTO count FROM Supplier;
-- The following is not allowed as result set in function.
-- SELECT * FROM Supplier;
RETURN count;
END //
DELIMITER ;
-- Call
SELECT GetAllSuppliers();
Note:
Example:
-- Parts supplied by a supplier
DROP PROCEDURE PartSupplied;
DELIMITER //
CREATE PROCEDURE PartSupplied(
IN snum VARCHAR(9),
IN max INT, -- Maximum number of records shown
OUT numberPartsSupplied INT -- Number of different parts supplied
)
BEGIN
SELECT COUNT(*) INTO numberPartsSupplied
FROM Supply S
WHERE S.SNUM = snum;
SELECT P.PNUM, P.PNAME, S.Quantity
FROM Supply S, Part P
WHERE S.SNUM = snum
AND S.PNUM = P.PNUM
LIMIT max;
END //
DELIMITER ;
-- Call
SET @numberPartsSupplied = 0;
CALL PartSupplied('S2', 2, @numberPartsSupplied);
SELECT @numberPartsSupplied;
Note:
Stored Program Syntax
Example on Sakila:
(1) Define a view, actor_appearence, in which each row of the view contains an actor with the following columns:
For example:
mysql> SELECT *
-> FROM actor_appearence
-> LIMIT 10;
+----------+------------+--------------+-----------+----------------+
| actor_id | first_name | last_name | num_films | num_categories |
+----------+------------+--------------+-----------+----------------+
| 1 | PENELOPE | GUINESS | 19 | 13 |
| 2 | NICK | WAHLBERG | 25 | 14 |
| 3 | ED | CHASE | 22 | 10 |
| 4 | JENNIFER | DAVIS | 22 | 12 |
| 5 | JOHNNY | LOLLOBRIGIDA | 29 | 13 |
| 6 | BETTE | NICHOLSON | 20 | 11 |
| 7 | GRACE | MOSTEL | 30 | 11 |
| 8 | MATTHEW | JOHANSSON | 20 | 10 |
| 9 | JOE | SWANK | 25 | 14 |
| 10 | CHRISTIAN | GABLE | 22 | 15 |
+----------+------------+--------------+-----------+----------------+
10 rows in set (0.05 sec)
Solution:
DROP VIEW actor_appearence;
CREATE VIEW actor_appearence
AS
SELECT a.actor_id, a.first_name, a.last_name,
COUNT(DISTINCT fa.film_id) AS num_films,
COUNT(DISTINCT fc.category_id) AS num_categories
FROM actor a, film_actor fa, film_category fc
WHERE a.actor_id = fa.actor_id
AND fa.film_id = fc.film_id
GROUP BY a.actor_id, a.first_name, a.last_name;
-- Testing:
SELECT *
FROM actor_appearence
LIMIT 10;
(2) Using the view actor_appearence, provide the SQL statement to list all actors who have appeared in more films than HENRY BERRY in the exact format below.
+-----+----------------+-----------------+
| ID | Name | Number of Films |
+-----+----------------+-----------------+
| 107 | GINA DEGENERES | 42 |
| 102 | WALTER TORN | 41 |
| 198 | MARY KEITEL | 40 |
| 181 | MATTHEW CARREY | 39 |
| 23 | SANDRA KILMER | 37 |
| 81 | SCARLETT DAMON | 36 |
+-----+----------------+-----------------+
6 rows in set (0.10 sec)
Solution:
SELECT actor_id AS "ID",
CONCAT(first_name, ' ', last_name) AS "Name",
num_films as "Number of Films"
FROM actor_appearence
WHERE num_films >
(SELECT num_films
FROM actor_appearence
WHERE first_name = 'HENRY'
AND last_name = 'BERRY')
ORDER BY num_films DESC;
(3) Write a function num_same_film_rented(cid_1 INT, cid_2 INT) to return the number of films both customers cid_1 and cid_2 have rented. For example:
mysql> SELECT num_same_film_rented(1, 558);
+------------------------------+
| num_same_film_rented(1, 558) |
+------------------------------+
| 3 |
+------------------------------+
1 row in set (0.01 sec)
Solution:
DROP FUNCTION num_same_film_rented;
DELIMITER $$
CREATE FUNCTION num_same_film_rented(cid_1 INT, cid_2 INT) RETURNS FLOAT
READS SQL DATA
BEGIN
DECLARE result INTEGER DEFAULT 0;
SELECT COUNT(DISTINCT i1.film_id) INTO result
FROM rental r1, rental r2, inventory i1, inventory i2
WHERE r1.inventory_id = i1.inventory_id
AND r2.inventory_id = i2.inventory_id
AND i1.film_id = i2.film_id
AND r1.customer_id = cid_1
AND r2.customer_id = cid_2;
return result;
END $$
DELIMITER ;
SELECT num_same_film_rented(1, 558);
(4) Write a procedure, PROCEDURE most_similar_customers(cid INT, min_count INT) to return the customers most similar to cid. The most similar customers are defined to be the ones with at least min_count number of same rented films as customer cid.
For example:
mysql> CALL most_similar_customers(1,3);
+-------------+-----------------------------+
| customer_id | Number of same rented films |
+-------------+-----------------------------+
| 406 | 4 |
| 447 | 4 |
| 383 | 4 |
| 52 | 4 |
| 155 | 4 |
| 56 | 3 |
| 161 | 3 |
| 187 | 3 |
| 513 | 3 |
| 558 | 3 |
| 5 | 3 |
| 59 | 3 |
| 102 | 3 |
| 189 | 3 |
| 297 | 3 |
| 503 | 3 |
| 45 | 3 |
| 380 | 3 |
| 277 | 3 |
| 448 | 3 |
| 479 | 3 |
| 16 | 3 |
| 173 | 3 |
| 253 | 3 |
| 426 | 3 |
| 460 | 3 |
| 28 | 3 |
| 112 | 3 |
| 230 | 3 |
| 269 | 3 |
| 372 | 3 |
| 303 | 3 |
| 375 | 3 |
| 485 | 3 |
+-------------+-----------------------------+
34 rows in set (0.01 sec)
Query OK, 0 rows affected (0.15 sec)
Solution:
DROP PROCEDURE most_similar_customers;
DELIMITER $$
CREATE PROCEDURE most_similar_customers(cid INT, min_count INT)
READS SQL DATA
BEGIN
SELECT r2.customer_id, COUNT(DISTINCT i1.film_id) AS "Number of same rented films"
FROM rental r1, rental r2, inventory i1, inventory i2
WHERE r1.inventory_id = i1.inventory_id
AND r2.inventory_id = i2.inventory_id
AND i1.film_id = i2.film_id
AND r1.customer_id = cid
AND r2.customer_id <> cid
GROUP BY r2.customer_id
HAVING COUNT(DISTINCT i1.film_id) >= min_count
ORDER BY COUNT(DISTINCT i1.film_id) DESC;
END $$
DELIMITER ;
CALL most_similar_customers(1,3);
(5) Write a function, category_feature_length(customer_id INT), to compute the length of the category feature vector. Each element in the category festure vector is the number of the flims the customer has rented in each category in order.
For example, for customer 1, the category feature vector is:
(2,2,6,5,2,4,1,1,1,4,2,2,2,2,1).
For customer 20, it is:
(1,1,3,2,3,2,2,1,1,1,1,7,0,4,1). Note the 0 as the 14th element.
The length of the vector is the square root of the sum of each element.
Solution:
--
-- List the number of films rented in film categories for each customer.
--
DROP VIEW customer_category_rental;
CREATE VIEW customer_category_rental
AS
SELECT
r.customer_id AS customer_id,
fc.category_id AS category_id,
count(*) AS count
FROM rental AS r
INNER JOIN inventory AS i ON r.inventory_id = i.inventory_id
INNER JOIN film AS f ON i.film_id = f.film_id
INNER JOIN film_category AS fc ON f.film_id = fc.film_id
GROUP BY r.customer_id, fc.category_id
ORDER BY r.customer_id, fc.category_id;
DROP FUNCTION category_feature_length;
DELIMITER $$
CREATE FUNCTION category_feature_length(customer_id INT) RETURNS FLOAT
READS SQL DATA
BEGIN
DECLARE result FLOAT DEFAULT NULL;
SELECT SUM(Count * Count) INTO result
FROM customer_category_rental AS C
WHERE c.customer_id = customer_id;
IF Result IS NULL THEN
Return 0;
ELSE
Return SQRT(Result);
END IF;
END $$
DELIMITER ;
Example:
(6) Write a function, category_feature_distance(cid_1 INT, cid_2 INT), to compute the distance between two customers (with cutomer id cid_1 and cid_2). The function does not need to be efficient.
For example,
mysql> select category_feature_distance(1,20);
+---------------------------------+
| category_feature_distance(1,20) |
+---------------------------------+
| 0.08313219994306564 |
+---------------------------------+
1 row in set, 1 warning (4.80 sec)
Solution:
SELECT category_feature_length(1);
-- inefficient version using cursor.
DROP FUNCTION category_feature_distance;
DELIMITER $$
CREATE FUNCTION category_feature_distance(cid_1 INT, cid_2 INT) RETURNS FLOAT
READS SQL DATA
BEGIN
DECLARE result FLOAT;
DECLARE length_1 FLOAT;
DECLARE length_2 FLOAT;
DECLARE done INT DEFAULT FALSE;
DECLARE cat_id INT;
DECLARE count_1 INT;
DECLARE count_2 INT;
DECLARE cursor_1 CURSOR FOR SELECT distinct category_id from category;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
SET length_1 = category_feature_length(cid_1);
IF length_1 <= 0.0 THEN
RETURN NULL;
END IF;
SET length_2 = category_feature_length(cid_2);
IF length_2 <= 0.0 THEN
RETURN NULL;
END IF;
OPEN cursor_1;
SET result = 0;
compute_loop: LOOP
-- This is needed as there are other SQL statements that can set Done to true.
SET done = FALSE;
FETCH cursor_1 INTO cat_id;
IF done THEN
LEAVE compute_loop;
END IF;
SET count_1 = 0;
SELECT count INTO count_1
FROM customer_category_rental as C
WHERE C.category_id = cat_id
AND C.customer_id = cid_1;
SET count_2 = 0;
SELECT count INTO count_2
FROM customer_category_rental as C
WHERE C.category_id = cat_id
AND C.customer_id = cid_2;
SET result = result + (count_1 - count_2) * (count_1 - count_2);
END LOOP;
CLOSE cursor_1;
return SQRT(result) / (length_1 * length_2);
END $$
DELIMITER ;
5. Triggers
Example:
Using the Sakila database, consider the tables inventory and rental.
inventory (inventory_id, film_id, store_id, last_update)
rental (rental_id, rental-date, inventory_id, customer_id, return_date, staff_id, last_update)
The stored function inventory_in_stock(p_inventory_id INT) checks whether an inventory item is in stock. Its logic:
"AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED."
CREATE FUNCTION inventory_in_stock(p_inventory_id INT) RETURNS BOOLEAN
READS SQL DATA
BEGIN
DECLARE v_rentals INT;
DECLARE v_out INT;
#AN ITEM IS IN-STOCK IF THERE ARE EITHER NO ROWS IN THE rental TABLE
#FOR THE ITEM OR ALL ROWS HAVE return_date POPULATED
-- Redundant as the test is covered by the next SQL statement
SELECT COUNT(*) INTO v_rentals
FROM rental
WHERE inventory_id = p_inventory_id;
IF v_rentals = 0 THEN
RETURN TRUE;
END IF;
SELECT COUNT(rental_id) INTO v_out
FROM inventory LEFT JOIN rental USING(inventory_id)
WHERE inventory.inventory_id = p_inventory_id
AND rental.return_date IS NULL;
IF v_out > 0 THEN
RETURN FALSE;
ELSE
RETURN TRUE;
END IF;
END $$
DELIMITER ;
This can be inefficient.
We consider modifying the table inventory to speed up performance.
inventory (inventory_id, film_id, store_id, lost, in_stock, last_update)
Two columns are added:
We may consider that the column in_stock as a derived column. Its value can be derived from other columns and tables in the database. Maintaining data consistency of derived columns is a popular use of triggers.
ALTER TABLE inventory
ADD COLUMN lost BOOLEAN DEFAULT FALSE AFTER store_id;
ALTER TABLE inventory
ADD COLUMN in_stock BOOLEAN DEFAULT TRUE AFTER lost;
To reverse the operations:
ALTER TABLE inventory DROP COLUMN lost;
ALTER TABLE inventory DROP COLUMN in_stock;
To populate the columns initially:
UPDATE inventory
SET lost = FALSE;
UPDATE inventory
SET in_stock = inventory_in_stock(inventory_id);
-- checking
select * from inventory limit 5;
We now need to define triggers when we insert or update a rental row. We assume that a rental row will never be deleted.
The trigger after inserting a rental row is shown below. There may be an error. Can you spot it?
DELIMITER $$
CREATE TRIGGER insert_rental AFTER INSERT ON rental FOR EACH ROW
BEGIN
IF (new.return_date IS NULL)
THEN
UPDATE inventory
SET in_stock = FALSE
WHERE inventory.inventory_id = rental.inventory_id;
END IF;
END$$
DELIMITER ;
Note:
Checking:
-- Checking insert trigger.
SELECT * FROM rental WHERE inventory_id = 1;
SELECT * FROM inventory WHERE inventory_id = 1;
-- Rent inventory_id 1 out.
INSERT INTO rental
VALUES (16050,'2011-11-11 22:53:30',1,130,NULL,1,'2011-11-11 23:30:53');
SELECT * FROM rental WHERE inventory_id = 1;
SELECT * FROM inventory WHERE inventory_id = 1;
For the update trigger:
DELIMITER $$
CREATE TRIGGER update_rental AFTER UPDATE ON rental FOR EACH ROW
BEGIN
IF ((old.return_date IS NULL) and (new.return_date IS NOT NULL))
THEN
UPDATE inventory
SET in_stock = TRUE
WHERE inventory.inventory_id = new.inventory_id;
END IF;
END$$
DELIMITER ;
To check:
-- Checking insert trigger.
SELECT * FROM rental WHERE inventory_id = 1;
SELECT * FROM inventory WHERE inventory_id = 1;
UPDATE rental
SET return_date = '2011-11-11 23:30:53'
WHERE rental_id = 16050;
SELECT * FROM rental WHERE inventory_id = 1;
SELECT * FROM inventory WHERE inventory_id = 1;
-- House keeping to revert to original state
DELETE FROM rental WHERE rental_id = 16050;
DROP TRIGGER
insert_rental;
DROP TRIGGER update_rental;
ALTER TABLE inventory DROP COLUMN lost;
ALTER TABLE inventory DROP COLUMN in_stock;