Data Model
- Data Model
- 2.1 Hierarchical Data Model
- Basic Idea
- Hierarchical Data Schema
- Virtual Record
- 2.2 Network Data Model
- 2.3 Relational Data Model
- Basic Idea
- Terminology
- Relation, Tuple, Attribute, Column, Domain
- Primary Key
- Foreign Key, Referential Integrity
- Relational Algebra
- Projection
- Selection
- Union, Intersection, Set-Difference
- Cross-Product
- Joins
- Division
- Outer Joins
- Outer Union
- Relational Calculus
- 2.4 ER Data Model
- Basic Idea
- ER Diagram
- Constraints
- Advanced Topics
- 2.5 Object-Oriented Data Model
- 2.6 Other Data Model
- 2.7 Summary
2.1 Hierarchical Data Model
Basic Idea
Basic Idea: because many things in the real world are organized hierarchically, the hierarchical model describes the real world as a tree structure.
- Record: represents an entity in the real world (e.g., a teacher)
- Field: represents an attribute of the entity (age, title, ...)
- Parent-Child relationship (PCR): the most basic data relationship in the hierarchical model. It expresses a 1:N relationship between two record types.
Hierarchical Data Schema
- A hierarchical data schema consists of PCRs.
- Every PCR expresses one 1:N relationship
- Every record type can only have one parent
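The single-parent rule above can be illustrated with a small check. This is only a sketch: the function name `validate_schema` and the record types are hypothetical, and the check covers the single-parent rule only (it does not verify that the schema is one connected tree).

```python
def validate_schema(pcrs):
    """Map each child record type to its parent; reject a second parent."""
    parent_of = {}
    for parent, child in pcrs:
        if child in parent_of and parent_of[child] != parent:
            raise ValueError(
                f"{child} already has parent {parent_of[child]}, cannot add {parent}"
            )
        parent_of[child] = parent
    return parent_of


# Hypothetical schema: each pair is one PCR (parent type, child type).
schema = [
    ("Department", "Teacher"),
    ("Department", "Course"),
    ("Teacher", "Student"),
]
parent_of = validate_schema(schema)
print(parent_of["Student"])  # Teacher
```

Adding a PCR such as `("Course", "Student")` to this schema raises `ValueError`, because `Student` would then have two parents.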
Virtual Record
It is hard to use PCRs to express multiple parents, M:N relationships, and N-ary relationships.
The virtual record was introduced to solve this problem; it is in effect a pointer to a real record.
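A minimal sketch of the pointer idea, assuming a hypothetical M:N relationship between students and courses. The classes `Record` and `VirtualRecord` are illustrative names, not part of any real hierarchical DBMS.

```python
class Record:
    """A real record: holds its own field values and its child records."""

    def __init__(self, record_type, **fields):
        self.record_type = record_type
        self.fields = fields
        self.children = []


class VirtualRecord:
    """Holds no data of its own, only a pointer to the real record."""

    def __init__(self, target):
        self.target = target

    @property
    def fields(self):
        # Reads go through the pointer to the real record.
        return self.target.fields


# The real Student record is stored once, under its single real parent.
class_a = Record("Class", name="A")
alice = Record("Student", name="Alice")
class_a.children.append(alice)

# Each Course tree refers to the same student through a virtual record.
databases = Record("Course", title="Databases")
networks = Record("Course", title="Networks")
databases.children.append(VirtualRecord(alice))
networks.children.append(VirtualRecord(alice))

alice.fields["name"] = "Alice Smith"          # update the real record once
print(networks.children[0].fields["name"])    # Alice Smith
```

Because both courses point at the same stored record, the student data is not duplicated and an update is visible from every tree.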
2.2 Network Data Model
- Record and data items: data items are similar to fields in the hierarchical model, but a data item can be a vector.
- Set: expresses a 1:N relationship between two record types.
- LINK record type: used to express self-relationships, M:N relationships, and N-ary relationships.
- It breaks through the limits of the hierarchical structure, so it can express non-hierarchical data more easily.
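The concepts in the list above can be sketched as follows. The class names, the `Enrollment` LINK record type, and the data are all hypothetical; the point is only that an M:N relationship becomes two 1:N sets that share LINK records as members.

```python
class Record:
    def __init__(self, record_type, **items):
        self.record_type = record_type
        self.items = items  # data items; a value may be a vector (list)


class SetOccurrence:
    """One owner record and its member records: a 1:N relationship."""

    def __init__(self, owner):
        self.owner = owner
        self.members = []


alice = Record("Student", name="Alice", phones=["555-0100", "555-0101"])
bob = Record("Student", name="Bob", phones=[])
databases = Record("Course", title="Databases")

# LINK records: one per (student, course) pair.
link1 = Record("Enrollment", grade="A")
link2 = Record("Enrollment", grade="B")

# Each LINK record is a member of one student-owned set and one course-owned set.
student_sets = {"Alice": SetOccurrence(alice), "Bob": SetOccurrence(bob)}
course_sets = {"Databases": SetOccurrence(databases)}
student_sets["Alice"].members.append(link1)
student_sets["Bob"].members.append(link2)
course_sets["Databases"].members.extend([link1, link2])


def students_of(course_title):
    """Follow the course's LINK records back to the students that own them."""
    links = course_sets[course_title].members
    return [
        s.owner.items["name"]
        for s in student_sets.values()
        if any(link in s.members for link in links)
    ]


print(students_of("Databases"))  # ['Alice', 'Bob']
```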
2.3 Relational Data Model
Basic Idea
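Basic Idea: entities are represented as tables, relationships between entities are also represented as tables, and the result of every operation or query is again a table. This forms a closed space in which database problems can be studied with mathematical methods.
Advantages:
✓ Based on set theory, with a high level of abstraction
✓ Hides all lower-level details; simple, clear, and easy to understand
✓ Supports a new algebraic system: relational algebra
✓ Non-procedural query language: SQL
✓ Soft links: the essential difference from the earlier data models (tables express what pointers used to express)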
Terminology
Relation, Tuple, Attribute, Column, Domain
Primary Key
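A set of attributes is a candidate key for a relation if:
- No two distinct tuples can have the same values in this set of attributes (uniqueness)
- This is not true for any proper subset of this set of attributes (minimality)
A set of attributes that satisfies uniqueness but not necessarily minimality is called a super key [e.g., if the student ID is the primary key, then (student ID, name) is a super key].
If a relation has several candidate keys, one of them is chosen as the primary key and the others are called alternate keys.
If the primary key contains all the attributes of the relation, it is called an all-key.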
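The uniqueness and minimality conditions can be checked mechanically on a concrete table. The sketch below uses hypothetical data and helper names. Note the assumption: a key is a constraint on every legal instance of a relation, so a check on one instance can refute a key but can never prove one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """Uniqueness: no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """Uniqueness plus minimality: no proper subset is itself unique."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical Students instance.
students = [
    {"sid": "s1", "name": "Ann", "age": 20},
    {"sid": "s2", "name": "Bob", "age": 20},
    {"sid": "s3", "name": "Ann", "age": 21},
]

print(is_candidate_key(students, ("sid",)))         # True
print(is_superkey(students, ("sid", "name")))       # True
print(is_candidate_key(students, ("sid", "name")))  # False: not minimal
```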
Foreign Key, Referential Integrity
- Foreign key: a set of attributes in one relation that is used to 'refer' to a tuple in another relation. (It must correspond to the primary key of the second relation.) It acts like a 'logical pointer', i.e., a soft link.
- E.g. Enrolled(sid: string, cid: string, grade: string)
➢ sid is a foreign key referring to Students.
➢ If all foreign key constraints are enforced, referential integrity is achieved, i.e., there are no dangling references. [One way to maintain this is cascading deletion: if a tuple in Students is deleted and its sid still appears in Enrolled, the system automatically deletes the Enrolled tuples with that sid as well. Rejecting the deletion or setting the reference to NULL are other options.]
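The Enrolled example can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `Students` columns and the sample rows are made up, and `ON DELETE CASCADE` is chosen here to demonstrate cascading deletion (it is one enforcement policy, not the only one). SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE Enrolled (
        sid   TEXT REFERENCES Students(sid) ON DELETE CASCADE,
        cid   TEXT,
        grade TEXT,
        PRIMARY KEY (sid, cid)
    )
    """
)
conn.execute("INSERT INTO Students VALUES ('s1', 'Ann')")
conn.execute("INSERT INTO Enrolled VALUES ('s1', 'c1', 'A')")

# A dangling reference (no student 's9') is rejected.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('s9', 'c1', 'B')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# Deleting the student cascades to the matching Enrolled tuples.
conn.execute("DELETE FROM Students WHERE sid = 's1'")
remaining = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
print(dangling_rejected, remaining)
```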
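Relational Algebra
The five basic operations (selection, projection, union, set-difference, and cross-product) are complete within the closed space of relational algebra; all other operations can be derived from them.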
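Operation | Meaning |
---|---|
Intersection | Tuples that appear in both relations |
EXCEPT | Set difference (same as Set-Difference) |
Join | Join of two relations |
Division | Division of one relation by another |
Examples: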
Projection
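In relational algebra, duplicate tuples are removed from the result relation: from the point of view of the result, these tuples carry the same meaning, so there is no need to repeat them.
Real database products often do not do this, however, because users sometimes need the duplicates, for example to compute the average age of sailors.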
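The difference matters as soon as an aggregate is computed over the projected column. A small sketch with hypothetical sailor data, where `project_set` follows the algebra (duplicates removed) and `project_bag` keeps duplicates the way many products do by default:

```python
# Hypothetical Sailors instance; two sailors share the same age.
sailors = [
    {"sid": 22, "sname": "Dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "Lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "Rusty", "rating": 10, "age": 35.0},
    {"sid": 71, "sname": "Zorba", "rating": 10, "age": 35.0},
]


def project_set(rows, attrs):
    """Projection as in relational algebra: duplicates are eliminated."""
    return {tuple(row[a] for a in attrs) for row in rows}


def project_bag(rows, attrs):
    """Projection that keeps duplicates (multiset semantics)."""
    return [tuple(row[a] for a in attrs) for row in rows]


ages_set = [age for (age,) in project_set(sailors, ["age"])]
ages_bag = [age for (age,) in project_bag(sailors, ["age"])]

print(len(ages_set), len(ages_bag))   # 3 4
print(sum(ages_bag) / len(ages_bag))  # 42.625, the true average age
```

Averaging `ages_set` instead would count the two 35-year-old sailors only once and give a wrong result.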
Selection
Union, Intersection, Set-Difference
Cross-Product
Cartesian product
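As an illustration, the cross-product pairs every tuple of one relation with every tuple of the other, so the result has |R| × |S| tuples. The sketch below assumes relations stored as lists of dicts and prefixes attribute names with hypothetical relation names to avoid clashes.

```python
from itertools import product


def cross_product(r, s, r_name="R", s_name="S"):
    """Pair every tuple of r with every tuple of s."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in t1.items()},
            **{f"{s_name}.{k}": v for k, v in t2.items()},
        }
        for t1, t2 in product(r, s)
    ]


# Hypothetical data.
sailors = [{"sid": 22, "sname": "Dustin"}, {"sid": 31, "sname": "Lubber"}]
reserves = [
    {"sid": 22, "bid": 101},
    {"sid": 58, "bid": 103},
    {"sid": 22, "bid": 102},
]

pairs = cross_product(sailors, reserves)
print(len(pairs))  # 6 = 2 * 3
```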
Joins
The most commonly used join is the natural join.
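A natural join keeps the pairs of tuples that agree on all attributes with the same name and merges each pair into one tuple. A minimal sketch, assuming relations stored as non-empty lists of dicts with hypothetical data:

```python
def natural_join(r, s):
    """Join on all attributes that have the same name in both relations."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**t1, **t2}
        for t1 in r
        for t2 in s
        if all(t1[a] == t2[a] for a in common)
    ]


sailors = [{"sid": 22, "sname": "Dustin"}, {"sid": 31, "sname": "Lubber"}]
reserves = [
    {"sid": 22, "bid": 101},
    {"sid": 58, "bid": 103},
    {"sid": 22, "bid": 102},
]

joined = natural_join(sailors, reserves)
print(joined)
```

Only Dustin (sid 22) has matching reservations, so the result holds two tuples; Lubber and the reservation for sid 58 are dropped.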
Division
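Division is not a primitive operation, but it can be derived from the primitive operations.
The idea is "double negation":
[Find every x that fails to pair with some y in B, then remove those x values from the set of all x values.]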
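The double-negation idea corresponds to the standard derivation A/B = π_x(A) − π_x((π_x(A) × B) − A). The sketch below follows it step by step, assuming A is a set of (x, y) pairs and B a set of y values; the data is hypothetical.

```python
def divide(a, b):
    """Return every x that is paired in a with every y in b."""
    all_x = {x for x, _ in a}
    # First negation: x values that miss at least one y of b.
    disqualified = {x for x in all_x for y in b if (x, y) not in a}
    # Second negation: remove them from the set of all x values.
    return all_x - disqualified


# Hypothetical data: which supplier (x) supplies which part (y).
supplies = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"),
    ("s2", "p1"),
    ("s3", "p2"),
}
wanted_parts = {"p1", "p2"}

print(divide(supplies, wanted_parts))  # {'s1'}
```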
Outer Joins
Outer joins are an extension of join. In a join, only the matching tuples that satisfy the join condition are kept in the result. An outer join also keeps the unmatched tuples and fills the missing attributes with NULL:
- Left outer join (*⋈): keeps all tuples of the left relation in the result.
- Right outer join (⋈*): keeps all tuples of the right relation in the result.
- Full outer join (*⋈*): keeps all tuples of both the left and the right relation in the result.
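The three variants differ only in which unmatched tuples survive. A sketch under assumptions: relations are non-empty lists of dicts, the join is an equality on a single attribute `on`, and Python's `None` stands in for NULL. The function name and flags are hypothetical.

```python
def outer_join(r, s, on, keep_left=True, keep_right=True):
    """Equi-join on one attribute; optionally keep unmatched tuples."""
    r_attrs, s_attrs = list(r[0]), list(s[0])
    result, matched_right = [], set()
    for t1 in r:
        hit = False
        for j, t2 in enumerate(s):
            if t1[on] == t2[on]:
                result.append({**t1, **t2})
                matched_right.add(j)
                hit = True
        if not hit and keep_left:
            # Pad the attributes of s with NULL, then overlay the left tuple.
            result.append({**{a: None for a in s_attrs}, **t1})
    if keep_right:
        for j, t2 in enumerate(s):
            if j not in matched_right:
                result.append({**{a: None for a in r_attrs}, **t2})
    return result


sailors = [{"sid": 22, "sname": "Dustin"}, {"sid": 31, "sname": "Lubber"}]
reserves = [{"sid": 22, "bid": 101}, {"sid": 95, "bid": 103}]

left = outer_join(sailors, reserves, "sid", keep_right=False)
right = outer_join(sailors, reserves, "sid", keep_left=False)
full = outer_join(sailors, reserves, "sid")
print(len(left), len(right), len(full))  # 2 2 3
```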
Outer Union
- An extension of the union operation: it can union two relations that are not union-compatible.
- The attribute set of the result is the union of the attribute sets of the two operands.
- Attribute values that do not exist in an original tuple are filled with NULL.
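The three points above translate directly into code. This sketch assumes relations stored as lists of dicts, uses `None` for NULL, and does not merge tuples that share a key (some textbook definitions do); the function name and data are hypothetical.

```python
def outer_union(r, s):
    """Union of two relations that need not be union-compatible."""
    # The result schema is the union of both attribute sets.
    attrs = []
    for row in r + s:
        for a in row:
            if a not in attrs:
                attrs.append(a)
    result = []
    for row in r + s:
        padded = {a: row.get(a) for a in attrs}  # missing attributes -> None
        if padded not in result:                 # keep set semantics
            result.append(padded)
    return result


names = [{"sid": 22, "sname": "Dustin"}]
ratings = [{"sid": 22, "rating": 7}, {"sid": 31, "rating": 8}]

print(outer_union(names, ratings))
```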
Relational Calculus
Two variants: tuple relational calculus (TRC) and domain relational calculus (DRC).
Expressions in the calculus are called formulas. An answer tuple is essentially an assignment of constants to variables that makes the formula evaluate to true.
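Python set comprehensions read much like calculus formulas, which makes the difference between the two variants easy to see. The query ("sailors with rating above 7") and the data are hypothetical: in TRC the variable ranges over whole tuples, in DRC the variables range over individual field values.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", "sid sname rating age")

sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# TRC: { S | S in Sailors and S.rating > 7 }
trc_answer = {S for S in sailors if S.rating > 7}

# DRC: { <I, N, T, A> | <I, N, T, A> in Sailors and T > 7 }
drc_answer = {(I, N, T, A) for (I, N, T, A) in sailors if T > 7}

print(sorted(s.sname for s in trc_answer))  # ['Lubber', 'Rusty']
```

Each element of either answer is an assignment of constants to the variables that makes the condition true.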
2.4 ER Data Model
Basic Idea
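- Entity: a real-world object, described by a set of attributes (e.g., an employee)
- Entity Set: a collection of similar entities (e.g., all employees)
  - All entities in an entity set have the same attributes
  - Each entity set has a key
  - Each attribute has a domain
  - Attributes are allowed to be composite or multi-valued [this breaks through 1NF, the first normal form]
- Relationship: an association among two or more entities [e.g., Attishoo works in Pharmacy department]
  - A relationship can also have attributes
- Relationship Set: a collection of similar relationships
  - An n-ary relationship set R relates n entity sets E1...En; each relationship in R involves entities e1, ..., en
  - The same entity set can participate in different relationship sets, or in different "roles" in the same set.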
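To make the vocabulary concrete, here is a sketch that models entity sets and a relationship set as Python sets of frozen dataclasses. All class names, attributes, and values other than the "Attishoo works in Pharmacy" example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:          # entity; ssn plays the role of the key
    ssn: str
    name: str


@dataclass(frozen=True)
class Department:        # entity; did plays the role of the key
    did: int
    dname: str


@dataclass(frozen=True)
class WorksIn:           # relationship between one employee and one department
    employee: Employee
    department: Department
    since: str           # a relationship can carry its own attribute


attishoo = Employee(ssn="000-00-0001", name="Attishoo")
pharmacy = Department(did=51, dname="Pharmacy")

employees = {attishoo}                                       # entity set
departments = {pharmacy}                                     # entity set
works_in = {WorksIn(attishoo, pharmacy, since="2001-01-01")}  # relationship set

for rel in works_in:
    print(rel.employee.name, "works in", rel.department.dname)
```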
ER Diagram
Constraints
Advanced Topics
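- Weak Entity [e.g., employees and their dependents: a dependent exists only in association with an employee, so the dependent is a weak entity]
- Specialization and Generalization [similar to inheritance in object-oriented programming]
- Aggregation: allows a relationship set to be treated as an entity set so that it can take part in another relationship set. For example, the "cooperation" relationship is treated as an entity and related to other entities.
- Category: allows entities of different types to be placed in the same entity set; such an entity set is also called a hybrid entity set.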
2.5 Object-Oriented Data Model
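- The shortcomings of the relational data model
- Breaking through 1NF
- Object-oriented analysis and programming
- The need for persistent storage of objects: the concern is how to store objects permanently and load them into memory when they are needed.
- Object-Relational DBMS: allows users to define their own data types, which breaks through 1NF, but it cannot be called an object-oriented data model.
- Native (pure) Object-Oriented DBMS: the pure object-oriented model has almost died out.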