Database Concurrency

What Is Database Concurrency?

Ever wondered how online platforms such as ticketing and shopping websites handle millions of simultaneous transactions without crashing or mixing up vital details? The answer lies in database concurrency.

Simply put, it is the ability of a database to handle multiple simultaneous read/write interactions from users or applications without impacting the integrity or the quality of the data stored within it.

Why Is Database Concurrency Essential?

Simultaneous interactions with a database often lead to the following problems:

Dirty Read

A dirty read occurs when one transaction reads an uncommitted or temporary value written into the database by another transaction that may be later aborted or rolled back.

For example, assume a joint bank account with your spouse, and the following events occur.

Time Action Account Balance
T1 Initial account balance 1000
T2 Partner Deposits 1000 (Transaction A) 2000
T3 Purchase something worth 1500 (Transaction B) 500
T4 Deposit transaction rolls back (-1000) -500

As you can see, the account balance becomes negative after transaction B commits because it reads an uncommitted value from transaction A, which was later rolled back.

Lost Data Update

A lost data update occurs when two transactions update the same data item, and one of them overwrites the changes made by the other without knowing it.

For example, assume you have two employees trying to update the product details of a specific product on your online store, and the following occurs.

Time Employee A Employee A Product Price
T1 Reads the product Price 100
T2 Increases price by 10% 110
T3 Reads the product Price @ 110 110
T4 Commits 110
T5 Decreases the product price by 5% 104.5
T5 Commits 104.5

Here, the result is not what either employee wanted it to be.

Phantom Read

A phantom read occurs when one transaction reads a set of data items that satisfy some condition and finds that another transaction has inserted or deleted some data items that also satisfy the same condition in between.

For example, say you have been tasked to determine the total number of products that have been rated 4 or higher by customers. Consider the following sequence of events.

Time Action Number of Products with Rating >= 4
T1 Run the first search query 10
T2 Forward the information 10
T3 Another customer rates a new product 4 or higher 11
T4 Run the same search query again 11

Unrepeatable Read

An unrepeatable read occurs when one transaction reads the same data item twice and finds different values each time because another transaction has updated the data item in between.

Time Transaction A Transaction B Name on Record
T1 Read name (John) John
T2 Update name (Jack) Jack
T3 Commit Jack
T4 Read name (Jack) Jack

Here, transaction A has read two different values for the same data item, which can cause confusion or errors when used for subsequent data operations.

So, how do you overcome these problems? The answer is Database Concurrency Control techniques. Let’s look at a few of these techniques in detail.

Database Concurrency Control Techniques

The following are the most effective database concurrency control techniques that help overcome the problems mentioned above.

Locking

Locking is a technique that prevents concurrent transactions from accessing or altering the same data item with the use of locks (read or write). These locks can be exclusive or shared and can be implemented on various levels of granularity (records, pages, tables, or databases), ensuring database serializability.

Time Stamping

Time stamping is a database concurrency control technique that assigns a time stamp to each transaction. These timestamps are used to order concurrent transactions and determine which transaction “wins” in case of conflicts. Older timestamps typically have precedence, ensuring a predictable execution order.

Multi-versioning

Multiversioning is a database technique that keeps multiple versions of each data item with a timestamp or version number. It allows concurrent transactions to access different versions of the same data item but requires more database storage space.

Validation

Validation is a technique that divides each transaction into three phases:

  • Read phase: The transaction reads the data items from the database and stores them in a local buffer.
  • Validation phase: The transaction checks whether it has any conflicts with other transactions that have already been committed.
  • Write phase: The transaction writes the updated data items back to the database if it passes the validation.

Validation allows concurrent transactions to read the same data item without interference.