Normal Forms (1st to 3rd)

Guest lecture by K. Yue (November 29th, 2001)

This supplementary note is for a guest lecture on Dr. Boetticher's class on November 29th, 2001. Refer to the main note by Dr. Boetticher. The notes are based on my old notes in database classes.

Lossless Decomposition

It is assumed that the students are already familiar with the theory of lossless decomposition, including the concepts of closure, minimal covers, etc, as well as the 'matrix algorithm' for testing lossless decomposition.

I have written several Lisp programs long time ago for getting closure, cover and determining lossless decomposition. If you are interesting in reading, they are here:

Relational Database Designs

Example:

Relation Scheme:

EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO).

with the following assumptions:

An instance of the relation:

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D225 42315
31000 Carl Sagan D337 33323

Redundancy:

The fact Manager 54321 manages department D123 is stored three times.

Update Anomalies:

Update: 44444 is now the new manager of department D123

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 44444
20000 Art Garfunkel D123 44444
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D225 42315
31000 Carl Sagan D337 33323

Update: Magic Johnson now works for department D337:

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D337 ?????
31000 Carl Sagan D337 33323

Update: Nolan Ryan now works for the new department D822, which does not have a manager yet.

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D822 ?????
22000 Magic Johnson D225 42315
31000 Carl Sagan D337 33323

Insertion Anomalies:

Insert: a new department D888, with manager 77111, currently with no employee is not possible.

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D225 42315
31000 Carl Sagan D337 33323
????? ????? D888 77111


Deletion Anomalies:

Delete: Carl Sagan no longer works here.

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D225 42315

Solution

Decomposition to relations that satisfy desirable normal forms.

EMP(EMP_NO, NAME, DEPT_NO)
DEPARTMENT(DEPT_NO, MANAGER_NO)

All anomalies mentioned above are now gone.

Normal Forms

First Normal Form

Example:

Consider the following table. It is not in 1 NF.

DEPT_NO MANAGER_NO EMP_NO EMP_NAME
D101 12345 20000
20001
20002
Carl Sagan
Magic Johnson
Larry Bird
D102 13456 30000
30001
Jimmy Carter
Paul Simon

The corresponding relation in 1 NF:

DEPT_NO MANAGER_NO EMP_NO EMP_NAME
D101 12345 20000 Carl Sagan
D101 12345 20001 Magic Johnson
D101 12345 20002 Larry Bird
D102 13456 30000 Jimmy Carter
D102 13456 30001 Paul Simon

Second Normal Form

Example:

The following relation is not in 2NF. The relation has the following FD:

Student_ID, Course -> Grade
Course -> Credit

Note the redundancy and anomalies.

Student_ID Course Credit Grade
S1 CSCI 5333 3 A
S1 CSCI 4230 3 A
S2 CSCI 5333 3 B-
S2 CSCI 4230 3 C
S3 CSCI 5333 3 B+

Third Normal Form

Example:

The example relation for anomalies is not in 3NF.

EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO).

with the following assumptions:

An instance of the relation:

EMP_NO NAME DEPT_NO MANAGER_NO
10000 Paul Simon D123 54321
20000 Art Garfunkel D123 54321
13000 Tom Jones D123 54321
21000 Nolan Ryan D225 42315
22000 Magic Johnson D225 42315
31000 Carl Sagan D337 33323

Example:

Consider R(A,B,C) with the minimal cover F: {A -> B}. Note that F |- B -> B, or B -> Bis in F+. For B -> B, B is not a superkey and B is non-prime. However, B -> B is not a violation of 3NF as it is trivial and should not be considered for potential violation.

Example:

Consider the relation

S(SUPP#, PART#, SNAME, QUANTITY) with the following assumptions:

(1) SUPP# is unique for every supplier.
(2) SNAME is unique for every supplier.
(3) QUANTITY is the accumulated quantities of a part supplied by a supplier.
(4) A supplier can supply more than one part.
(5) A part can be supplied by more than one supplier.

We can find the following nontrivial functional dependencies:

(1) SUPP# --> SNAME
(2) SNAME --> SUPP#
(3) SUPP# PART# --> QUANTITY
(4) SNAME PART# --> QUANTITY

Note that SUPP# and SNAME are equivalent.

The candidate keys are:

(1) SUPP# PART#
(2) SNAME PART#

The relation is in 3NF.

However, the relation has unnecessary redundancy:

SUPP# SNAME PART# QUANTITY
S1 Yues P1 100
S1 Yues P2 200
S1 Yues P3 250
S2 Jones P1 300

We need a higher normal: BCNF. The relation above is not in BCNF.



Dr. Kwok-Bun Yue
Professor, Computer Science and Computer Information Systems
Chair, Division of Computing and Mathematics
University of Houston-Clear Lake
2700 Bay Area Boulevard
Houston, TX 77058
Yue's Home  Yue's home page     Yue's email  yue@uhcl.edu     phone  281-283-3864