Introduction to the course
DASC 5333/CSCI 4333
by K. Yue
1. Promotion
- This course is (hopefully) one of the more useful CS/DS courses for students.
- The amount of data in the world is estimated to double roughly every two years.
2. How to be successful in the course
General Course Suggestions:
- Course expectations are demanding.
- Please consider forming the habit of listening carefully and asking a lot of questions.
General Professionalism:
- Attitude
- Be considerate
- Be helpful and useful to others
- Be a good listener
- Responsive
- Hardworking
- Attention to detail
- Focus: uni-tasking
Some general Tips:
- Engagement: Participate. Ask questions, a lot of them. Help others. Plan ahead.
- Preparation: start as early as possible and do not fall behind.
- Don’t copy and paste. Instead, copy, integrate, and apply.
- SEE-I: State, Elaborate, Exemplify, and Illustrate.
- Form good habits.
Some good traits of Computer and Data Scientists:
- A habit of trying to make sense of things.
- Intellectual curiosity.
- Tinkering and experimentation.
- Open-minded and not dogmatic.
- A large tool set.
3. Resources
- Companion materials for our textbook: please consult the course page in Canvas for additional resources related to the textbook.
- Contents of the course will be based mostly on:
- Lecture notes posted on the course website: http://dcm.uhcl.edu/yue/courses/joindb/current/index.html.
- Classroom demonstrations.
- Assignments.
- Please read the appropriate pages in the textbook and the lecture notes on this site before coming to class.
- Document your learning. Bring a notebook to class, or print out the notes and bring them so you can annotate them during class.
4. Introduction
- Persistent data is the backbone of many applications.
- Three main choices for storing persistent data:
- Files
- Databases: focus of this course.
- Cloud-based storage and databases.
- Some advantages of a DBMS (according to Ricardo, the optional textbook of this class):
- Sharing of data
- Control of redundancy
- Data consistency
- Improved data standards
- Better data security
- Improved data integrity
- Balance of conflicting requirements
- Faster development of new applications
- Better data accessibility
- Economy of scale
- More control over concurrency
- Better backup and recovery procedures
- How do we make sense of these 12 different advantages?
- Different textbooks may list different advantages of a DBMS because they classify them differently.
- No need to memorize them.
- Better to assimilate them and construct your own list.
- Make your own notes. Use SEE-I: in your own words, state, elaborate, exemplify, and illustrate the concept.
- Learning through documentation, communications, and teaching.
- What are some disadvantages of a DBMS?
- Complexity
- Cost
- Learning curve
- Single point of failure and potential bottleneck
5. A Simple Introduction to the Relational Model
- Relational databases are the most popular databases: https://db-engines.com/en/ranking. They are based on the relational model.
- There are many other data models.
- In layman's terms: a table (relation) is the basic unit of a relational database.
- A table is composed of many rows.
- Each row has one value for each column of the table.
- A primary key is roughly a minimal set of columns in a table that uniquely identifies a row.
- Two tables can be related to each other by foreign keys. A foreign key is roughly a column (or set of columns) in a table whose value must equal a primary key value in another table (called the parent table).
- Microsoft's Access is based on the relational model. It may be considered a desktop relational DBMS.
- Relational DBMSs are the most popular kind of DBMS. Examples:
- SQL is the 'glue' in many DB systems.
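The ideas of tables, rows, primary keys, and foreign keys above can be tried out in a few lines. The following is only a minimal sketch, assuming Python with its built-in sqlite3 module (SQLite is a small relational DBMS; it is not the Access database used in class). The table and column names (department, student, dept_code, stu_id, major) are made up for illustration and are not taken from toyu.accdb.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is turned on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table (hypothetical): dept_code is the primary key.
conn.execute("""
    CREATE TABLE department (
        dept_code TEXT PRIMARY KEY,
        dept_name TEXT NOT NULL
    )
""")
# Child table (hypothetical): major is a foreign key
# that references department.dept_code.
conn.execute("""
    CREATE TABLE student (
        stu_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        major  TEXT REFERENCES department(dept_code)
    )
""")

# One row in each table.
conn.execute("INSERT INTO department VALUES ('CSCI', 'Computer Science')")
conn.execute("INSERT INTO student VALUES (1, 'Ada', 'CSCI')")


def try_insert(sql):
    """Run one INSERT; return True if accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same primary key value (1) as an existing row: should be rejected.
duplicate_key_ok = try_insert("INSERT INTO student VALUES (1, 'Bob', 'CSCI')")
# 'MATH' does not exist in the parent table: should be rejected.
dangling_fk_ok = try_insert("INSERT INTO student VALUES (2, 'Cy', 'MATH')")

# SQL as the 'glue': a join follows the foreign key to the parent table.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student AS s
    JOIN department AS d ON s.major = d.dept_code
""").fetchall()

print(duplicate_key_ok, dangling_fk_ok, rows)
```

In this sketch the DBMS, not the application code, refuses both the duplicate primary key and the foreign key value with no matching parent row, which is one concrete way to read "improved data integrity" from the list of advantages.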
Classroom discussion
Please ask questions about the toy University DB in Access (HW #1): toyu.accdb