SQL Transactions must be: ACID
- Atomic: Done or not done. Atomic
- Consistent: Don’t do partial writes
- Isolated: Transaction Serializability:= transactions should seem like they are executed in isolation
- Durable: Crashes should be recoverable
Isolation
Concepts
- Serial Schedule: execute transactions in order. Don’t interleave anything.
- Conflicting Operations: two transactions conflict if any of the operations they have on same data conflict iff one of the operations is a write operation
- Dirty Write: →
- Dirty Read: →
- & Two transactions are is conflict-serializable iff they have conflicts, but we can interleave some operations but still seems like it’s a serial schedule.
SEREALIZABLE
guarantee above is equivalent to this.
- Conflicting Transactions:
- See Isolation (database systems) - Wikiwand
- Non-repeatable Reads: reads → writes and commits → reads again (different value!)
- Phantom Reads: gets range → inserts between this range and commits → gets same range (different value!)
- Precedence graph: a schedule with no cycles in the precedence graph (=graph is DAG) is conflict-serializable
- A path in an acyclic precedence graph is a conflict-serializable schedule
- Example:
Levels of Isolation Guarantees
- Read Uncommitted: lock & release immediately
- Read Committed: don’t release write locks until commit
- Repeatable Reads: long duration locks
- Serializable: long duration locks on ranges
Serializable Guarantee
Implemented thru Strict 2-phase locking (S2PL). Rules:
- One writer, multiple readers: Readers-Writer Locking
- 2-Phase Locking: For each transaction, you must lock everything first (=locking phase) then unlock everything together (=unlock phase):
- Strict Locking: Release write-locks (=exclusive-locks) only at commit or abort time
- Guaranteed recoverable (no cascading rollbacks)
- Example:
- Rigorous (=Strong) Locking: Release all locks only at commit or abort time.
- Prevents Deadlocks
- Also Recoverable.
- Example:
Recovery
Computation Model:
Naive Recovery
- Force: on commit, (“force”) dirty flush to disk
- No Steal: don’t (“stealthily”) flush before commiting
Smarter: Undo/Redo Logging
- log start of transaction
- For each write/delete, log the old and new values
- On commit, write all logs to disk Properties
- Write-ahead logging: before disk flush, log must be flushed first
- No force: commit even without flushing
- Steal: flush to disk anytime (just log it!)
Fuzzy Checkpointing
- Log begin checkpointing and log open transactions
Start Checkpoint CHK1, Open transactions: S,T
- flush all current transactions (=transactions at “begin” point) at leisure
- Log finish checkpointing (and pointer for convenience)
- `Finish Checkpoint CHK1. Started at
- `Finish Checkpoint CHK1. Started at
Recovery
- Analyze & Redo Phase
- Find last completed checkpoint
Finished Checkpoint CHK1…
- Go to the start of that checkpoint
Start Checkpoint CHK1, Open: S,T…
- Analyze open transactions at time of crash
- ! Start with checkpoint’s open transactions
- When you encounter close transactions
Close S
, remove from - When you encounter new open transactions
Begin T
, add that to
- All operations between
Start Checkpoint…
and end of log is replayed
- Find last completed checkpoint
- Undo Phase
- From the last log item until whenever:
- For each open transaction at time of crash
- Undo all of its operations in reverse order…
- Until you reach
Begin T
- For each open transaction at time of crash
- Stop when is exhausted
- From the last log item until whenever:
Recoverability
For multiple transactions to be recoverable:
- T1 writes → T2 reads, then must T1 commits → T2 commits
- T1 writes → T2 writes doesn’t even matter
Ed discussion: