Database Modeling
by K. Yue
1. Resources
UML:
UML and DB Modeling:
2. Introduction to DB Modeling
- Why DB modeling?
- Many developers just figure out the relational schema directly. Do you see any problem with that?
- Initial ideas may be need-based. Their definitions can be very rough and ambiguous.
- "I want a system to store and retrieve sale receipts for small business to claim sale tax waiver."
- "I will come back on 12/1/1946..."
- What are involved in DB modeling?
- Understand, represent and refine until it is clear and detailed enough for implementation.
- Communications are crucial. "Do not build your DB application. Build the DB application of your users."
- Complexity management (try to compare problem solving not involving computers):
- modeling language
- best practice and theory
- process
Classroom Exercise
The CS and CIS programs want to have a Web-based application for teaching assistant applications. Ask questions so you can better understand the problem. Prepare a list of questions.
Understanding the problem domain
How do people solve problems?
- Get an understanding of the problem.
- Form a model to capture the understanding.
- Based on the model, devise a solution.
- Implement the solution.
The problem of many novices is that they do not spend enough time and effort in steps (1) and (2) (i.e. modeling) and jump to (3) and (4) quickly.
- Your supervisor/customer may not have a clear model themselves!
- In most cases, for budding software developers: what you think what the applications should be is NOT important!
Examples:
- "I need a class enrollment report."
What you can do to understand the problem and model it:
- Start by asking a lot of questions.
- Write!
- Use your own words and avoid the 'copy and paste' trap. (What is the copy and paste culture?)
- Find out what is currently being used. Use them with the users.
Some possible questions to ask for modeling.
- What are the purposes of the application?
- What problems will the application solve? What value will it provide?
- What are the boundary and context of the applications?
- Who are the users?
- Who are the domain experts?
- What are the available documentation?
- How does the existing system work?
Two important options for database modeling:
- Entity Relationship (ER) Modeling and Extended ER (EER) Modeling: more database specific.
- UML modeling: more general purpose.
Advantages of ER diagrams:
- Easy to understand
- Simpler
- More specific to database modeling
- Good collection of theory and best practices
- Good vendor and tool support
Disadvantages of ER diagrams:
- Expressiveness
- Not concise
- Many versions can be confusing.
What do we look for in a modeling language?
- Completeness: can it capture all needed information
- Expressiveness: can it capture information in an efficient and easy way for discussion and communication.
- Domain support: how well does it support capturing the information in the application domain.
- Ambiguity
- Conciseness
- Standardization
- Tool support
- Community support
We will focus on UML in this course. You need to know how to construct equivalent ER diagrams.
Read: A Badia and D. Lemire. A call to arms: revisiting database design. SIGMOD Record 40, 3 (November 2011), 61-69.