Converting ER Diagrams to Relation Schema
by K. Yue
1. Introduction
- ER models need to be converted to the underlying data model.
- For Relational DB, they need to be converted to relational schema.
2. Mapping of ER Diagrams to the Relational Model
- The problem is: Source Model --> Target Model.
- Source Model: ER
- Basic structures: entities, relationships and attributes.
- Auxiliary structures: candidate keys, primary keys, data types, cardinality, nullability, etc.
- Not all ER models are the same: different notations, extensions and guidelines.
- Target Model: Relational
- Basic structures: relations (tables) and attributes (columns).
- Auxiliary structures: candidate keys, primary keys, data types, nullability, indexes, constraints, etc.
- Different RDBMS provide different features.
- There are no universal mapping rules.
- Rules (of thumbs) for conversion are shown below.
- You need to understand the rationale behind these rules so you can adapt them to create your own mapping rules for a specific project which has a unique set of:
- requirements
- source ER models
- target RDBMS
Entities:
E1. An entity is mapped to a relation.
- The relation may use the same name.
- Two relations may be merged into a single relation later on.
Attributes:
A1: A single-valued attribute of atomic data type (of an entity or relationship) is mapped to an attribute in the corresponding table.
A2: For a multiple-valued attribute A of an entity E or a relationship L that is implemented as a relation R, A is implemented by a new relation R_A(RId, A), where RId is the primary key of the relation R.
- (RId, A) is a candidate key.
- RId is a foreign key references R(RId).
- You may create a surrogate key, such as R_A_Id. If so, there will be two candidate keys: (1) R_A_Id, (2) (RId, A)
- The name of the new relation R_A should be meaningful, such as with the mentioning of A and (E or L).
A3. A single-value attribute of composite data type can be implemented in two ways (not covered)
- A3a. If there is a suitable composite data type in the targeted RDBMS, implement it as an attribute with the targeted composite data type. (Note that first normal form is now not satisfied.)
- A3b. Otherwise, treat it as a multi-valued attribute and use Rule A2.
A4: For derived attribute A, there are usually two ways(not covered):
- A4a. Implement as an attribute: mechanism such as triggers or stored procedures should be used to ensure the correctness of the derived values.
- A4b. Not implement as an attribute: mechanism such as stored functions or views can be used.
Keys and Constraints
KC1. If a relation R implements an entity E or a relationship L, and E or L has candidate keys K's, set K's as candidate keys in R.
KC2. If a relation R implements an entity E or a relationship L, and E or L has no candidate key, create a surrogate candidate key SK for R.
- This is needed since every relation must have a candidate key.
KC3. An UNIQUE constraint should be specified for candidate keys. Each attribute of a candidate key should also be set to NOT NULL.
KC4. Select one candidate key as the primary key and set it as the primary key in the relation.
KC5. If the cardinality of a single-valued attribute is specified:
- KC5a. If 0 is allowed, add NULL into the column definition. Note that this may not be necessary as NULL is usually the default.
- KC5b. If 0 is not allowed, add NOT NULL into the column definition.
KC6. If a constraint in the ER diagram (including those specified in comments and other documentation) can be implemented by the SQL Data Definition Language (DDL), do so. Otherwise, be mindful that they will need to be implemented somewhere else.
- All requirements should be implemented.
Relationships
R1. For a many to one relationship between E1 (the entity with the one cardinality) and Em, add a column E1_Id into the relation Rm (which implements Em).
- Assume that R1_Id is the primary key of the relation R1.
- R1_Id is a foreign key of Rm references R1(R1_Id).
- The name R1_Id may need to be renamed.
- If 0 is not allowed (i.e. 1..1) for E1, then R1_Id is not null in Rm.
- Any single valued attributes of the relationship is mapped to a column in Rm.
- If you have composite or multi-valued attributes of the relationship, you should consider promoting the relationship to an entity in your ER model.
R2. For a many to many relationship between entities EA and EB, create a new relation RAB(RA_Id, RB_Id).
- (RA_Id, RB_Id) is a candidate key.
- RA_Id references RA(RA_Id).
- RB_Id references RB(RB_Id).
- An additional surrogate key, such as RAB_Id, can be created.
R3. For a one to one relationship between entities EA and EB, there are several options:
- Treat EA as E1 and EB as Em and apply R1.
- Treat EA as Em and EB as E1 and apply R1.
- Merge the two relations RA and RB into one. (In this case, you may want to refactor the ER diagram.)
R4. For any n-ary relationship (n>2), a new relation is needed.
- You should consider using only binary relationships.
Example:
We will discuss classroom examples.