Normal Forms (1st to 3rd)
Guest lecture by K. Yue (November 29th, 2001)
This supplementary note is for a guest lecture on Dr. Boetticher's class on November 29th, 2001. Refer to the main note by Dr. Boetticher. The notes are based on my old notes in database classes.
Lossless Decomposition
It is assumed that the students are already familiar with the theory of lossless decomposition, including the concepts of closure, minimal covers, etc, as well as the 'matrix algorithm' for testing lossless decomposition.
I have written several Lisp programs long time ago for getting closure, cover and determining lossless decomposition. If you are interesting in reading, they are here:
Relational Database Designs
Example:
Relation Scheme:
EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO).
with the following assumptions:
An instance of the relation:
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D225 | 42315 |
| 31000 | Carl Sagan | D337 | 33323 |
Redundancy:
The fact Manager 54321 manages department D123 is stored three times.
Update Anomalies:
Update: 44444 is now the new manager of department D123
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 44444 |
| 20000 | Art Garfunkel | D123 | 44444 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D225 | 42315 |
| 31000 | Carl Sagan | D337 | 33323 |
Update: Magic Johnson now works for department D337:
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D337 | ????? |
| 31000 | Carl Sagan | D337 | 33323 |
Update: Nolan Ryan now works for the new department D822, which does not have a manager yet.
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D822 | ????? |
| 22000 | Magic Johnson | D225 | 42315 |
| 31000 | Carl Sagan | D337 | 33323 |
Insertion Anomalies:
Insert: a new department D888, with manager 77111, currently with no employee is not possible.
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D225 | 42315 |
| 31000 | Carl Sagan | D337 | 33323 |
| ????? | ????? | D888 | 77111 |
Deletion Anomalies:
Delete: Carl Sagan no longer works here.
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D225 | 42315 |
Solution
Decomposition to relations that satisfy desirable normal forms.
EMP(EMP_NO, NAME, DEPT_NO)
DEPARTMENT(DEPT_NO, MANAGER_NO)
All anomalies mentioned above are now gone.
Normal Forms
First Normal Form
Example:
Consider the following table. It is not in 1 NF.
| DEPT_NO | MANAGER_NO | EMP_NO | EMP_NAME |
| D101 | 12345 | 20000 20001 20002 |
Carl Sagan Magic Johnson Larry Bird |
| D102 | 13456 | 30000 30001 |
Jimmy Carter Paul Simon |
The corresponding relation in 1 NF:
| DEPT_NO | MANAGER_NO | EMP_NO | EMP_NAME |
| D101 | 12345 | 20000 | Carl Sagan |
| D101 | 12345 | 20001 | Magic Johnson |
| D101 | 12345 | 20002 | Larry Bird |
| D102 | 13456 | 30000 | Jimmy Carter |
| D102 | 13456 | 30001 | Paul Simon |
Second Normal Form
Example:
The following relation is not in 2NF. The relation has the following FD:
Student_ID, Course -> Grade
Course -> Credit
Note the redundancy and anomalies.
| Student_ID | Course | Credit | Grade |
| S1 | CSCI 5333 | 3 | A |
| S1 | CSCI 4230 | 3 | A |
| S2 | CSCI 5333 | 3 | B- |
| S2 | CSCI 4230 | 3 | C |
| S3 | CSCI 5333 | 3 | B+ |
Third Normal Form
Example:
The example relation for anomalies is not in 3NF.
EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO).
with the following assumptions:
An instance of the relation:
| EMP_NO | NAME | DEPT_NO | MANAGER_NO |
| 10000 | Paul Simon | D123 | 54321 |
| 20000 | Art Garfunkel | D123 | 54321 |
| 13000 | Tom Jones | D123 | 54321 |
| 21000 | Nolan Ryan | D225 | 42315 |
| 22000 | Magic Johnson | D225 | 42315 |
| 31000 | Carl Sagan | D337 | 33323 |
Example:
Consider R(A,B,C) with the minimal cover F: {A -> B}. Note that F |- B -> B, or B -> Bis in F+. For B -> B, B is not a superkey and B is non-prime. However, B -> B is not a violation of 3NF as it is trivial and should not be considered for potential violation.
Example:
Consider the relation
S(SUPP#, PART#, SNAME, QUANTITY) with the following assumptions:
(1) SUPP# is unique for every supplier.
(2) SNAME is unique for every supplier.
(3) QUANTITY is the accumulated quantities of a part supplied by a supplier.
(4) A supplier can supply more than one part.
(5) A part can be supplied by more than one supplier.
We can find the following nontrivial functional dependencies:
(1) SUPP# --> SNAME
(2) SNAME --> SUPP#
(3) SUPP# PART# --> QUANTITY
(4) SNAME PART# --> QUANTITY
Note that SUPP# and SNAME are equivalent.
The candidate keys are:
(1) SUPP# PART#
(2) SNAME PART#
The relation is in 3NF.
However, the relation has unnecessary redundancy:
| SUPP# | SNAME | PART# | QUANTITY |
| S1 | Yues | P1 | 100 |
| S1 | Yues | P2 | 200 |
| S1 | Yues | P3 | 250 |
| S2 | Jones | P1 | 300 |
We need a higher normal: BCNF. The relation above is not in BCNF.
Dr. Kwok-Bun Yue
Professor, Computer Science and Computer Information Systems
Chair, Division of Computing and Mathematics
University of Houston-Clear Lake
2700 Bay Area Boulevard
Houston, TX 77058
Yue's home page
yue@uhcl.edu
281-283-3864