Good and Bad Database Design
by K. Yue
1. Introduction
Example:
Consider the relation
EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO).
Assumptions:
EMP_NO | NAME | DEPT_NO | MANAGER_NO |
10000 | Lady Gaga | D123 | 54321 |
12000 | Eminem | D123 | 54321 |
13000 | Lebron James | D123 | 54321 |
21000 | Rajiv Gandhi | D225 | 42315 |
22000 | Bill Clinton | D225 | 42315 |
31000 | John Smithson | D337 | 33323 |
Problems:
Update Anomalies:
(a) 44444 is now the manager of department D123
EMP_NO | NAME | DEPT_NO | MANAGER_NO |
10000 | Lady Gaga | D123 | 44444 |
12000 | Eminem | D123 | 44444 |
13000 | Lebron James | D123 | 54321 |
21000 | Rajiv Gandhi | D225 | 42315 |
22000 | Bill Clinton | D225 | 42315 |
31000 | John Smithson | D337 | 33323 |
(b) Bill Clinton now works for department D337:
EMP_NO | NAME | DEPT_NO | MANAGER_NO |
10000 | Lady Gaga | D123 | 54321 |
12000 | Eminem | D123 | 54321 |
13000 | Lebron James | D123 | 54321 |
21000 | Rajiv Gandhi | D225 | 42315 |
22000 | Bill Clinton | D337 | ????? |
31000 | John Smithson | D337 | 33323 |
The command:
UPDATE Employee
SET DEPT_NO = 'D337'
WHERE NAME = 'Bill Clinton';
will produce inconsistent result.
Insertion Anomalies:
Create a new department D777, with manager 77111, currently with no employee is not possible.
EMP_NO | NAME | DEPT_NO | MANAGER_NO |
10000 | Lady Gaga | D123 | 54321 |
12000 | Eminem | D123 | 54321 |
13000 | Lebron James | D123 | 54321 |
21000 | Rajiv Gandhi | D225 | 42315 |
22000 | Bill Clinton | D225 | 42315 |
31000 | John Smithson | D337 | 33323 |
???? | ???? | D777 | 77111 |
Deletion Anomalies:
John Smithson no longer works here. Result: The information that 33323 is the manager of department D337 is lost.
EMP_NO | NAME | DEPT_NO | MANAGER_NO |
10000 | Lady Gaga | D123 | 54321 |
12000 | Eminem | D123 | 54321 |
13000 | Lebron James | D123 | 54321 |
21000 | Rajiv Gandhi | D225 | 42315 |
22000 | Bill Clinton | D225 | 42315 |
A standard way of resolving these issues are by proper decomposition: breaking down a relation into 2 or more relations.
Example:
EMP(EMP_NO, NAME, DEPT_NO)
EMP_NO | NAME | DEPT_NO |
10000 | Lady Gaga | D123 |
12000 | Eminem | D123 |
13000 | Lebron James | D123 |
21000 | Rajiv Gandhi | D225 |
22000 | Bill Clinton | D225 |
31000 | John Smithson | D337 |
DEPARTMENT(DEPT_NO, MANAGER_NO)
DEPT_NO | MANAGER_NO |
D123 | 54321 |
D225 | 42315 |
D337 | 33323 |
To obtain the original relation EMPLOYEE(EMP_NO, NAME, DEPT_NO, MANAGER_NO) from
EMP(EMP_NO, NAME, DEPT_NO)
DEPARTMENT(DEPT_NO, MANAGER_NO)
Relational algebra:
EMPLOYEE = EMP |x| DEPARTMENT
SQL:
SELECT EMP.*, MANAGER_NO INTO EMPLOYEE
FROM EMP, DEPARTMENT
WHERE EMP.DEPT_NO = DEPARTMENT.DEPT_NO
Methods for good designs
Integrity Rules:
Examples:
General Integrity Rules:
No component of a candidate key of a relation can have a null value.
Meaning: In a relational database, a tuple that cannot be identified will not be recorded.
Example:
EMPLOYEE(EMP_NO, NAME, DEPT_NO, SALARY)
EMP_NO | NAME | DEPT_NO | SALARY |
10000 | Lady Gaga | D123 | 550000000 |
12000 | Eminem | D123 | 10000000 |
13000 | Lebron James | D123 | 50000000 |
21000 | Rajiv Gandhi | @ | @ |
@ | Bill Clinton | D225 | 2000000 |
@ | John Smithson | D337 | 70000 |
Referential Integrity Rule
Example 1:
EMP(EMP_NO, NAME, DEPT_NO)
EMP_NO | NAME | DEPT_NO |
10000 | Lady Gaga | D123 |
12000 | Eminem | D123 |
13000 | Lebron James | @ |
21000 | Rajiv Gandhi | D225 |
22000 | Bill Clinton | D225 |
31000 | John Smithson | D337 |
DEPARTMENT(DEPT_NO, MANAGER_NO)
DEPT_NO | MANAGER_NO |
D123 | 54321 |
D225 | 42315 |
D337 | 33323 |
EMP(EMP_NO, NAME, DEPT_NO)
EMP_NO | NAME | DEPT_NO |
10000 | Lady Gaga | D123 |
12000 | Eminem | D123 |
13000 | Lebron James | D123 |
21000 | Rajiv Gandhi | D225 |
22000 | Bill Clinton | D225 |
31000 | John Smithson | D337 |
32000 | Bun Yue | D119 |
DEPARTMENT(DEPT_NO, MANAGER_NO)
DEPT_NO | MANAGER_NO |
D123 | 54321 |
D225 | 42315 |
D337 | 33323 |