Mapping UML Class Diagrams to the Relational Model
by K. Yue
1. Resources
UML and DB Modeling:
2. Mapping OO Model to the Relational Model
- Once an OO data model is constructed, it needs to be mapped for implementation in the selected database model.
- If relational DB is used, the mapping will be from OO (classes, attributes, associations, etc.) to relational schema (relations, attributes, keys, etc.)
- Note that the relational model and the OO model are very different.
- CASE tools or DB modeling tools may provide varying degrees of facilities for generating relational schema and corresponding SQL statements.
- However, it is important to understand the mechanism a tool uses to generate the relational schema and make adjustment if needed.
The problem is: Source Model --> Target Model.
- Source Model: UML
- Basic structures: classes, associations, and attributes.
- Auxiliary structures: objects, multiplicity, data types, default values, stereotypes, candidate keys, primary keys, nullability, etc.
- Target Model: Relational
- Basic structures: relations, and columns (attributes).
- Auxiliary structures: rows (tuples), candidate keys, primary keys, data types, nullability, indexes, constraints, etc.
- Different RDBMS provide different features.
3. Conversion Rules
Classes
C1. A class C is mapped to a relation RC.
- Relations may later be merged in design refinement and performance tuning.
- The relation may use the same name as the class.
Attributes
ATT1. (Basic) Include all single-valued attributes (with simple data types) of a class C as the attributes of RC, the relation for the class C.
ATT2. (Basic) For each multi-valued attribute A of the class C, create a new relation RCA containing the attribute A and the primary key, RCId, of the relation RC (which implements the class C).
- (RCId, A) is a composite candidate key.
- RCId is a foreign key referencing RC(RCId)
- A surrogate key, such as RCA_Id, may be created to serve as a simple candidate key, to be selected as the primary key.
- The name of RCA should be meaningfully selected.
ATT3. A single-valued attribute of composite data type (such as set, list, array) can be mapped in various ways.
- If there is an comparable composite data type in the targeted DBMS, it can be implemented as an attribute of that data type in the relation.
- The relation will no longer be in the first normal form.
- Care should be taken in handling the difference in data type mapping.
- Otherwise, regard the attribute as a multi-valued attribute and apply rule ATT2.
ATT4. For a derived attribute A:
- It can be implemented and stored as an attribute of the relation.
- Mechanism, such as triggers or stored procedures, should be used to ensure data consistency. That is the derived column should be consistent with the data that derives its value.
- It may not be stored as a column directly in any relation.
- Mechanism, such as virtual columns, views or stored functions, may be used to provide standard access to the derived attributes.
ATT5. Data type matching should be handled effectively and consistently.
- In later modeling phases, one may use SQL data types of the targeted DBMS in the class diagram.
- If available, consider using user-defined data types in the targeted DBMS.
ATT6. If the multiplicity of an attribute is specified for the case of 0.
- If 0 is allowed, add the NULL specifier in the column definition. (NULL is usually the default)
- If 0 is not allowed, add the NOT NULL specifier in the column definition.
ATT7. The default value of an attribute may be directly implemented in SQL DDL.
Keys and Constraints
KC1. If a relation R implements a class C or an association (class) A, and C or A has candidate keys K's, set all K's as candidate keys in R.
KC2. If a relation R implements a class C or an association (class) A, and C or A has no candidate key, create a surrogate candidate key SK for R.
- This is needed as every relation must have at least one candidate key.
- SK is usually the primary key.
KC3. All candidate keys can be implemented by using the 'unique' and non-null constraint in SQL.
KC4. Select a candidate key as the primary key and set it accordingly in the relation.
KC5. For a stereotype:
- Some may be directly implemented in SQL DDL, e.g. PK, CK, unique, etc.
- Otherwise, it is necessary to consider where it is implemented.
Example:
Consider the class Member with the following attributes:
- Member_Id: <<PK>>
- Member_ScreenName <<unique>>
- Hobby*
- Medal* <<ordered>>

What may the relational schema look like?
- Member(MemberId, MemberScreenName):
- CK: [1] MemberId, [2] MemberScreenName
- Hobbies(HobbiesId, MemberId, Hobby):
- CK: [1] HobbiesId, [2] MemberId, Hobby
- FK: [1] MemberId references Member(MemberId)
- A surrogate key, HobbiesId, is created as the primary key.
- Medals(MedalsId, MemberId, Medal):
- CK: [1] MedalsId, [2] MemberId, Medal.
- FK: [1] MemberId references Member(MemberId)
- A surrogate key, MedalsId, is created as the primary key.
All columns in the table above are not nullable.
Example:
A class Rectangle has three attributes:
- Length
- Width
- \area: derived.
What may the relational schema look like?
One solution:
Rectangle(RectangleId, Length, Width)
with a view Rect define as
select RectangleId, Length, Width, Length * Width as Area
from Rectangle;
Alternatively, using virtual column:
CREATE or replace TABLE rectangle (
width DOUBLE,
height DOUBLE,
area DOUBLE AS (width * height) virtual
);
Associations
A1. (basic) For a many to one association between C1 (the class with the one multiplicity) and Cm, add a column C1_Id into the relation Rm (which implements Cm).
- Assume that R1_Id is the primary key of the relation R1.
- R1_Id is a foreign key of Rm referencing R1(R1_Id).
- The name R1_Id may need to be renamed.
- R1_Id is not null in Rm If and only if 0 is not allowed (i.e. 1..1) for C1.
- Any single valued attribute of the association is mapped to a column in Rm.
- If you have composite or multi-valued attributes of the relationship, you should consider promoting the association to an association class or a class in your UML class diagram.
A2. (basic) For a many to many association between classes CA and CB, create a new relation RAB(RA_Id, RB_Id).
- (RA_Id, RB_Id) is a candidate key.
- RA_Id references RA(RA_Id) as a foreign key.
- RB_Id references RB(RB_Id) as a foreign key.
- An additional surrogate key, such as RAB_Id, can be created.
A3. For a one to one association between classes CA and CB, there are several options:
- Treat CA as C1 and CB as Cm and apply A1.
- Treat CA as Cm and CB as C1 and apply A1.
- Merge the two relations RA and RB into one. (In this case, you may want to refactor the class diagram.)
A4. For any n-ary association (n>2), a new relation is needed.
- You should consider using binary associations instead.
Example:
Consider the classes Order and OrderItem which have a (1..1) to (0..*) association. The association has an attribute 'packed'.
What may the relational schema look like?
OrderItem(OrderItemId, ..., OrderId, packed):
- OrderId is a foreign key referencing Order(OrderId)
Example:
Consider the class User and Account, which has a one to one association.
What may the relational schema look like?
Example:
Consider the tertiary association between the classes Supplier, Part, and Warehouse with an association attribute quantity.
What may the relational schema look like?
Supply(SupplyId, SupplierId, PartId, WarehouseId, Quantity):
- CK: [1] SupplyId, [2] SupplierId, PartId, WarehouseId
- FK: [1] SupplierId references Supplier(SupplierId), [2] PartId references Part(PartId), [3] WarehouseId references Warehouse(WarehouseId).