SQL Transactions must be: ACID

  • Atomic: Done or not done. Atomic
  • Consistent: Don’t do partial writes
  • Isolated: Transaction Serializability:= transactions should seem like they are executed in isolation
  • Durable: Crashes should be recoverable

Isolation

Concepts

  • Serial Schedule: execute transactions in order. Don’t interleave anything.
  • Conflicting Operations: two transactions conflict if any of the operations they have on same data conflict iff one of the operations is a write operation
    • Dirty Write:
    • Dirty Read:
    • & Two transactions are is conflict-serializable iff they have conflicts, but we can interleave some operations but still seems like it’s a serial schedule. SEREALIZABLE guarantee above is equivalent to this.
  • Conflicting Transactions:
    • See Isolation (database systems) - Wikiwand
    • Non-repeatable Reads: reads writes and commits → reads again (different value!)
    • Phantom Reads: gets range → inserts between this range and commits → gets same range (different value!)
  • Precedence graph: a schedule with no cycles in the precedence graph (=graph is DAG) is conflict-serializable
    • A path in an acyclic precedence graph is a conflict-serializable schedule
    • Example:

Levels of Isolation Guarantees

  • Read Uncommitted: lock & release immediately
  • Read Committed: don’t release write locks until commit
  • Repeatable Reads: long duration locks
  • Serializable: long duration locks on ranges

Serializable Guarantee

Implemented thru Strict 2-phase locking (S2PL). Rules:

  1. One writer, multiple readers: Readers-Writer Locking
  2. 2-Phase Locking: For each transaction, you must lock everything first (=locking phase) then unlock everything together (=unlock phase):
  3. Strict Locking: Release write-locks (=exclusive-locks) only at commit or abort time
    1. Guaranteed recoverable (no cascading rollbacks)
    2. Example:enter image description here|200
  4. Rigorous (=Strong) Locking: Release all locks only at commit or abort time.
    1. Prevents Deadlocks
    2. Also Recoverable.
    3. Example:

Recovery

Computation Model:

Naive Recovery

  • Force: on commit, (“force”) dirty flush to disk
  • No Steal: don’t (“stealthily”) flush before committing

Smarter: Undo/Redo Logging

  1. log start of transaction
  2. For each write/delete, log the old and new values
  3. On commit, write all logs to disk Properties
  4. Write-ahead logging: before disk flush, log must be flushed first
  5. No force: commit even without flushing
  6. Steal: flush to disk anytime (just log it!)

Fuzzy Checkpointing

  • Log begin checkpointing and log open transactions
    • Start Checkpoint CHK1, Open transactions: S,T
  • flush all current transactions (=transactions at “begin” point) at leisure
  • Log finish checkpointing (and pointer for convenience)
    • `Finish Checkpoint CHK1. Started at

Recovery

  1. Analyze & Redo Phase
    1. Find last completed checkpoint Finished Checkpoint CHK1…
    2. Go to the start of that checkpoint Start Checkpoint CHK1, Open: S,T…
    3. Analyze open transactions at time of crash
      1. ! Start with checkpoint’s open transactions
      2. When you encounter close transactions Close S, remove from
      3. When you encounter new open transactions Begin T, add that to
    4. All operations between Start Checkpoint… and end of log is replayed
  2. Undo Phase
    1. From the last log item until whenever:
      1. For each open transaction at time of crash
        1. Undo all of its operations in reverse order…
        2. Until you reach Begin T
    2. Stop when is exhausted

Recoverability

For multiple transactions to be recoverable:

  1. T1 writes → T2 reads, then must T1 commits → T2 commits
  2. T1 writes → T2 writes doesn’t even matter

Ed discussion: