Database
2018-06-21 14:13:20 2 举报
AI智能生成
INFO90002 Database University of Melbourne IT IS 2018 S1
作者其他创作
大纲/内容
Modelling
SQL
Normalisation
can prevent
data redundancy
update (and insert, delete) anomalies
biased optimizing UPDATE
slow down SELECT
1NF
multivalued attributes and repeating groups
2NF
partial functional dependencies
3NF
transitive dependencies
BCNF
remaining anomalies
cons:
more tables
need to be joined during SELECTs
denormalisation example
one-to-one relationship
many-to-many relationship with attributes <br>
reference data / lookup table
Physical Design
data type
text
CHAR
VARCHAR(M)
ENUM('V1','V2')
TEXT
number
TINYINT
255 <br>
SMALLINT
64k<br>
MEDIUMINT
16M<br>
INT
4G
BIGINT
2^64
DECIMAL
time
DATE
TIME
DATETIME
TIMESTAMP
YEAR
data integrity
default value
range control
null value control
referential integrity
indexing
on large tables
>100 distinct value
frequently used in WHERE, ORDER, GROUP BY
limited use for volatile db
Application
system architecture
presentation logic
business logic
storage logic
multi-tiered architecture
2 tiers
client-dbms
3 tiers
client-web server-dbms
4 tiers
client-web server-app server-dbms
Transaction
definition
A logical unit of work that must either be entirely completed <br>or aborted (indivisible, atomic) <br>
why
users need the ability to define a unit of work <br>
concurrent access to data by >1 user or program <br>
properties (ACID)<br>
Atomic
Consistency
Isolation
Durability
Transaction logging<br>
concurrent access
Lost Update problem <br>
Uncommitted Data problem <br>
Inconsistent Retrieval problem <br>
methods
serial execution (very expensive!)
Locking
Database-level lock<br>
Table-level lock<br>
(Page-level lock)<br>
Row-level lock<br>
Field-level lock (same row different attributes)<br>
Deadlock
two transactions wait for each other to unlock data <br>
methods: prevention, detection
Time Stamping
Optimistic
Distributed
definition
single logical database
physically spread
appear as though it is one database
Advantages
Good fit for geographically distributed organizations/users <br>
Data located near site with greatest demand<br>
Faster data access (to local data)<br>
Faster data processing <br>
Allows modular growth<br>
Increased reliability and availability<br>
Supports database recovery <br>
Disadvantages
Complexity of management and control<br>
Data integrity<br>
Security<br>
Lack of standards <br>
Increased training & maintenance costs <br>
Increased storage requirements <br>
Trade-offs
Location transparency
not need to know the location of data
Local autonomy
able to operate locally when connection fail
Availability vs Consistency
Synchronous vs Asynchronous
Distribution options<br>
Data replication <br>
Horizontal partitioning <br>
Vertical partitioning <br>
Combinations of the above <br>
Replication
Adv
High reliability
Fast access
avoid complicated distributed integrity routines<br>
Decoupled nodes don't affect data availability <br>
Reduced network traffic at prime time <br>
Dis
Need more storage space
Data integrity
Takes time for update operations <br>
Network communication capabilities<br>
Administration
DBMS
in memory and physically on disk
manage
data
performance
concurrency
recoverability
NoSQL
收藏
收藏
0 条评论
下一页