ER Modeling
by K. Yue
1. Introduction to DB Modeling
- Why DB modeling?
- Many DB developers just directly figure out the relational schema without formal data modeling.
- Do you see any problem with that? Not a big problem if the problem is super simple.
- What are involved in DB modeling?
- Understand, capture, and refine the application requirement until it is clear and detailed enough for design and implementation.
- Communications are crucial. "Do not build your DB application. Build the DB application for the users."
- Complexity management (try to compare problem solving not involving computers). Tools and techniques:
- modeling language
- best practice and theory
- process
What you can do to understand the problem and model it?
- Start by asking a lot of questions.
- Collect as much documentation as possible.
- Study and use the existing system: find out what is currently being used. Use them with the users.
- Participate in the work process.
- Model, write, and document!
- Use your own words and avoid the 'copy and paste' trap. (What is the 'copy and paste' culture?)
Some possible questions to ask for modeling.
- What are the purposes of the application?
- What problems will the application solve? What value will it provide?
- What are the boundary and context of the applications?
- Who are the users?
- Who are the domain experts?
- What are the available documentation?
- How does the existing system work?
- What are the workflow processes of the current system?
Prepare your own list of questions.
Two important options for database modeling:
- Entity Relationship (ER) Modeling and Extended ER (EER) Modeling: more relational database specific.
- UML modeling: more general purpose.
2. ER Modeling and ER Diagrams
- Some advantages of ER diagrams:
- Easy to understand
- Simple
- More specific to database modeling
- Good collection of theory and best practices
- Good vendor and tool support
- Good DBMS support
- Some Disadvantages of ER diagrams:
- Limited expressiveness
- Not concise
- Can be ambiguous
- No standards: many versions that can be confusing
- Mostly for relational database only.
- There are many ER diagram tools, with different relative merits.
- For our course, download and install "EA Assistant v2.1" from http://highered.mheducation.com/sites/0072942207/student_view0/e_r_assistant.html. This is one of the most simple one.
- In ER modeling, requirements are captured, such as business rules, and modeled using ER constructs.
ER Diagrams
- Symbols used in traditional ER Diagrams:
- Entity: rectangle
- Relationship: diamond
- Attribute: oval
- There are Links between entity and relationship: line
- Entities are 'objects that exists and that can be distinguished from other objects.'
- Entities can be anything.
- Examples: person, place, event, object, concept, etc.
- Entities usually have properties that are represented by attributes.
- Note the difference between:
- Entity type: the type of an entity. E.g. the entity type Student.
- Entity set: a set of entities of the same type. E.g. a set of entities containing students in the class ITEC 3333 in Fall 2017.
- An entity: an instance of one entity. E.g. a student S1.
- Eventually, when the ER model is implemented in a relational model:
- entity type -> relation schema
- entity set -> relation instance
- entity -> a row in a relation
- Compare these concepts with Java's class and object.
- In a ER diagram, the rectangle may represent an entity type that you can create an entity instance to be stored in an entity set.
Example:
Using ER Assistant, we may have:
Note:
- We will demonstrate how to draw the ER diagram in ER Assistant in class.
- It does not use the traditional ER notations, such as in Ricardo.
Student is an entity type that you can use to instantiate Student objects. Examples: we may have the following instances.
S1: Sadegh Davari
S2: Brad Pitts
S3: Stephen Curry
Each student entity represents a student.
Class is an entity type that you can use to instantiate Class entity. Examples: we may have the following instances.
C1: CSCI 4333
C2: CSCI 3388
C3: CINF 3311
Each course entity represents a course.
The relationship Enroll represents association between a student and a class. The meaning: a student may enroll in a class. For example, we may have:
S1 enrolls in C1
S1 enrolls in C3
S2 enrolls in C1
S3 enrolls in C2
S3 enrolls in C3
The relationship is actually not between two entity types. The relationship is between two entities of some specific types.
In fact, a relationship may be between two entities of the same type.
Example:
Toyu: A drastically simplified university.
ER Diagram:
The diagram is prepared by MySQL Workbench.
PVFC in Microsoft Visio:
Example:
Consider (created by ER Assistant):
The meaning of the relationship "is a prerequisite": one class entity can be a prerequisite of another class entity. For example, we may have:
C1 is a prerequisite of C2
C3 is a prerequisite of C2
- An attribute presents a property of an entity.
- A multi-valued attribute may have many values.
- The name of the attribute is usually in plural.
- Example: Hobbies, Names, Honors, Medals, etc.
- Example: Names: a DB problem may need to store all names (or alias) of a person. 'Bun', 'Bun Yue', 'K.B. Yue', etc.
- A derived attribute has a value that can be computed (derived) from other attributes (that may be in another relations).
- In the traditional ER model:
- ordinary attribute: oval.
- multi-valued attribute: double oval
- derived attribute: dashed oval
- ER Assistant does not support notations for multi-valued attribute or derived attribute.
- Multi-valued attributes: consider promoting the attribute to an entity.
- Derived attributes: document them in the comment section.
- Relationships are connections, associations, or interactions between entity instances.
- Note the difference between:
- relationship type: what entities and their types participate in the relationship.
- relationship set: a set of relationships of a specific type
- relationship instance: a relationship between two entity instances.
Example:
Using ER Assistant, we may have:
The relationship type Enroll allows a student entity to has an Enroll association with a class entity.
Here are five relationship instances of Enroll:
r1:S1 enrolls in C1
r2:S1 enrolls in C3
r3:S2 enrolls in C1
r4:S3 enrolls in C2
r5:S3 enrolls in C3
Together, they form a relationship set.
- There are different degrees of relationship
- binary – links two entities
- ternary – links three entities sets
- N-ary – links n entities
- EA-Assistant only supports binary relationship.
- Relationship cardinality (or multiplicity): number of entity instances to which another entity can map under the relationship.
- For binary relationship:
- one to one
- one to many
- many to many
- For ternary relationship, the cardinality cannot be drawn in Crow's Foot notation.
- A ternary relationship may be replaced by
- two binary relationships: e.g. Figure 3.8.
- an entity and three binary relationships, e.g. project assignment between employee, project and role.
- In general, avoid n-ary relationship, where n>2.
Example:
- Zero or many employees work for one department.
- Assumptions:
- A department may have no employee.
- An employee must work with one department.
- Zero or many employees work for one and only one department.
- Assumptions:
- A department may have no employee.
- An employee must work with exactly one department. if an employee does not work for any department, it cannot be added to the database.
- What are the assumptions made below?
Example:
Discuss the ER model for students in an university and its relationship with degree program (e.g. B.Sc. in CIS, BA in Math, MS in CS, etc).
You may start with:
- What should the cardinality be?
- Relationship may have entity role: how the entity participate in the relationship.
- Roles are not supported by EA-Assistant.
- Kind of participation:
- Total participation
- Every entity must participate in the relationship.
- Represented by double line from entity rectangle to relationship diamond in some notations.
- Recommend not to use (2).
- Partial participation
- An entity instance may or may not participate in the relationship.
- Note that this classification can be captured by the multiplicity of the relationship:
- Total participation: 0 not allowed (1..1 or 1..*)
- Partial participation: 0 allowed (0..1 or 0..*)
Total participation:
Partial participation:
- Other relevant concepts are listed below.
- Existence dependence: same as total participation, but may be more forceful. Example:
- If an employee must work for a department, this is total participation.
- It can also be interpreted as "existence dependent": an employee cannot exist (in the database) without a department.
- Semantically, in the real world, employee can exist independent of a department. It may just not exist in the database.
- If Y is existence dependent of X, and Y does not have its own candidate keys, then Y is a weak entity and X is a strong entity.
- Weak entity may have a partial key, a discriminator, that uniquely identifies a weak entity among those related to the same strong entity.
- In some notations, use double rectangle for weak entity, with double diamond for relationship connecting it to its strong entity (e.g. Ricardo).
- Recommendation: don't use it. This is because an entity can be a weak entity in one relationship and a strong entity in another relationship.
Notes:
- Do not confuse an ER diagram with a relationship diagram (usually provided by DBMS).
- A relationship diagram shows the relations in a database and their relationships through foreign keys.
Example:
Note that:
|
Entity-Relationship Diagram |
Relationship Diagram |
Rectangle |
Entity |
Relation |
Link |
Relationship between entities; can be n-ary. |
Foreign key relationship between relations; always binary. |