4. Thinking outside the box - Maintaining multiple versions to speed up concurrency in a system
Two processes are said to be concurrent when they perform operations at the same time. Locking performs great when the workload is heavily write oriented whereas it has an overhead in case of a read-only workload. Optimistic methods, on the other hand, perform brilliantly when there are less read-write conflicts.
But what about a mixed workload where reads and writes happen simultaneously? Can we optimize for this case?
It turns out that we can maintain multiple versions (ordered by timestamp) of the same data item and transactions have a “visibility set” when they operate. A transaction can only see the value of the item which was modified before the transaction’s timestamp. This way, a write operation will add a new version while a read operation will search the list to find the appropriate version to read.
While the approach adds an overhead of maintaining such lists, it is an amazing approach where readers don’t block writers and writers don’t block readers! However, write-write conflicts still need to be resolved.
This method is used in the industry by major database systems like PostgreSQL, Amazon DynamoDB, Google’s BigTable, etc.