Introduction to the relational model

by K. Yue

1. Introduction

The relational model is based on the concept of a relation.

Important definitions:

Example:

A tuple: a set of mapping.

{SID: 123456, SNAME: "Johnson, Gary", MAJOR: "CSCI", GPA: 3.71}

The same tuple:

{SNAME: "Johnson, Gary", MAJOR: "CSCI", SID: 123456, GPA: 3.71}

Other possible representations:

(SID: 123456, SNAME: "Johnson, Gary", MAJOR: "CSCI", GPA: 3.71)

If the order for displaying the attribute names is understood:

(123456, "Johnson, Gary", "CSCI", 3.71)

Consider the very simple database: supply.sql.txt

MariaDB [supply]> select * from supplier;
+------+-------+---------+--------+
| SNUM | SNAME | SCITY   | STATUS |
+------+-------+---------+--------+
| S1   | ABC   | Dallas  |     10 |
| S2   | DEF   | Houston |     20 |
| S3   | Go go | Houston |     12 |
| S4   | P&G   | Dallas  |      2 |
| S5   | Yap   | Phoenix |      5 |
| S6   | Yue   | Dallas  |      1 |
+------+-------+---------+--------+
6 rows in set (0.04 sec)

MariaDB [supply]> select sname, snum, status, scity
    -> from supplier
    -> order by scity;
+-------+------+--------+---------+
| sname | snum | status | scity   |
+-------+------+--------+---------+
| ABC   | S1   |     10 | Dallas  |
| P&G   | S4   |      2 | Dallas  |
| Yue   | S6   |      1 | Dallas  |
| DEF   | S2   |     20 | Houston |
| Go go | S3   |     12 | Houston |
| Yap   | S5   |      5 | Phoenix |
+-------+------+--------+---------+
6 rows in set (0.00 sec)

 

  1. The two results are relations.
  2. They look different. They have different display views.
  3. However, the information stored is the same. In the pure relational model, the two relations are equal.

Notes:

Example: Different notations.

Student: {SID, SNAME, MAJOR, GPA}

or

Student(SID, SNAME, MAJOR, GPA)

Example:

A relation schema:

Student(SID, SNAME, MAJOR, GPA)

An instance of the relation:

SID SNAME MAJOR GPA
12345 Johnson MATH 3.33
13526 Yue CSCI 2.11
20010 Smith CSCI 3.94
10110 Bajaj MATH 3.22
11111 Ganta CSCI 3.78

Another instance of the relation:

SID SNAME MAJOR GPA
12345 Johnson MATH 3.55
13526 Yue CSCI 3.61
10110 Bajaj MATH 3.22
11111 Ganta CSCI 3.98

Consider the very simple database: supply.sql.txt

Some observations:

Null values:

Example:

Consider the attribute SpouseName. A Null value may mean:

  1. Not applicable: the person is not married.
  2. Missing information: the person is married and has a spouse, but we do not have the name of the spouse.
  3. No information at all: we do not know whether the person is married or not.

How do we distinguish between the three meanings of the null value in this case?

By using an extra attribute, such as MaritalStatus.

Note that the Boolean type (if supported) actually uses a 3-valued logic.

2. Keys

Example:

Consider the following valid relation instance:

A B C
12345 101 A
13098 201 B

Can you identify a candidate key?

Consider another valid relation instance.

StudentID Course ID Grade
12345 101 A
13098 201 B

Can you identify a candidate key?

What about this?

StudentID Course ID Grade Some other columns
12345 101 A  
13098 201 B  
13098 101 C-  
12345 201 D  

Example:

Given the following instance known to be valid:

StudentID Course ID Grade Some other columns
12345 101 A  
13098 201 B  
13098 101 C-  
12345 201 D  

The following sets of attributes cannot be CK: [1] StudentId, [2] CourseId.

More terms:

Example:

Consider Employee(EmployeeID, EmployeeName)

If we make these assumptions:

  1. An employee has an unique ID.
  2. An employee's name is not unique.

We can conclude:

  1. EmployeeID is a candidate key of Employee.
  2. {EmployeeID, EmployeeName} is not a candidate key of Employee: it is unique but not minimal. It is a superkey.
  3. Prime attribute: EmployeeID
  4. Non-prime attribute: EmployeeName

Example:

Consider R(EmployeeID, DepartmentID)

If we make these assumptions:

  1. An employee has an unique ID.
  2. A department has an unique ID.
  3. An employee is associated with (works for) 0 or 1 department.
  4. A department is associated with (may have) many employees.

We can conclude that EmployeeID is the only candidate key for R.

Under a different set of assumptions, we may have different candidate keys

Assumptions:

  1. An employee has an unique ID.
  2. A department has an unique ID.
  3. An employee is associated with 0 or 1 department.
  4. A department is associated with only 0 or 1 employee.

Then we have two candidate keys:(1) EmployeeId and (2) DepartmentID. The relation R may be called DepartmentHead.

Exercise:

A relation R has an arity of 5. What are the possible minimum and maximum number of superkeys of R? . What are the possible minimum and maximum number of candidate keys of R?

What about the general case?

Example:

Prove that a relation always has a candidate key.

Proof:

Since a relation is a set of tuples, the relation schema is always unique. If we can find a proper subset of relation schema that is a candidate key, we have found a candidate key. If not, the relation schema is minimal and is thus a candidate key.

Example:

Consider R(A,B,C) with two candidate keys A and B and S(A, D, E, F) with a single candidate key AD. A is a foreign key in S and the base table of A is R.