Know your Database System!

[To put this post into perspective see DB Concurrency Control with .NET – Overview]

When developing database-centric applications you must(!) understand the fundamentals of your specific database management system – even when using an ORM:

  • You must understand the DBMS architecture to be aware of the implications of using particular features.
  • You must understand the specific locking and concurrency control implementations because DBMSs implement these features differently. Implementation differences leak into your application layers (even through ORMs), even with seemingly standardized features like isolation levels.
  • Do not treat the database as a black box – a persistence layer you need not understand. The database is the most critical piece of many applications.

If you do not have a good grasp of how your particular database implements concurrency control mechanisms, then you will:

  • Corrupt the integrity of your data.
  • Get “wrong” answers from your DB.
  • Have applications run slower than they should and could – regardless of the number of users.
  • Decrease your application’s ability to scale to a large number of users.

Writing DBMS neutral applications is difficult:

  • I prefer exploiting the specific features of a DBMS.
  • Only write DB neutral apps when strong requirements demand it.
  • If writing DB neutral apps, test with all candidate databases from the beginning.
    Consider using RDBMS-specific stored procedure layers.

Locks, Blocking and Deadlocks

All multiuser RDBMSs use locking to control concurrency, consistency, and integrity. Locking serializes access: a lock on a resource causes concurrent operations to block when accessing that resource until the locking operation releases the lock. If the performance impact of blocking is too large, the lock usage must be tuned (e.g. by shortening lock duration or choosing a different transaction isolation level). Deadlocks occur when concurrent operations each hold locks in a way that will never be resolved by waiting. Typical deadlock examples are:

  • Cycle Deadlock
    Competing sessions acquire locks on the same resources in different sequences. Can be avoided by always using the same sequence when accessing the data (ex: alphabetically by table name; top down or bottom up left to right through the relationship hierarchy).
  • Conversion deadlock (specific to SQL Server pessimistic mode)
    Competing sessions hold shared read locks on the same resource and try to promote these to exclusive locks for updating. Can be avoided by using the WITH (UPDLOCK) lock hint, by placing SELECTs on which no UPDATEs are based outside the update TransactionScope (with Serializable only), or by using a lower isolation level.

    Using tra As New TransactionScope()
        'Causes a conversion deadlock if executed concurrently:
        ..."SELECT * FROM Inventories WHERE Type='tool'"
        'No conversion deadlock:
        ..."SELECT * FROM Inventories WITH (UPDLOCK) WHERE ID=42"
        ...
        ..."UPDATE Inventories SET Quantity=17 WHERE ID=42"
        tra.Complete()
    End Using
    

Locks are used in both the pessimistic and optimistic concurrency models. With SQL Server, locks consume scarce resources, and row locks may escalate to page or table locks; SQL Server escalates to balance concurrency against lock overhead. In Oracle, locks do not consume scarce resources (see [ExpOra], “What are locks”, “Lock Escalation”): locks are taken on row modifications only, there are no read locks, and locks are never escalated. Many recommendations to keep transactions as short as possible are only true in the context of SQL Server’s pessimistic mode. Deadlocks are rare with Oracle – in contrast to SQL Server’s default pessimistic mode (see [SQLInt]).
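Where escalation to a table lock hurts concurrency on a specific hot table, SQL Server 2008 and later let you switch it off per table. A sketch (the table name is illustrative, not from this post):

```sql
-- Keep locks at row/page granularity for this table (SQL Server 2008 and later)
ALTER TABLE dbo.Inventories SET (LOCK_ESCALATION = DISABLE);

-- Inspect the locks currently held in the instance
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks;
```

Disabling escalation trades lower blocking for higher lock memory consumption, so it should be a measured decision, not a default.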

RDBMS Concurrency Models

DBMS concurrency is the capability to allow multiple sessions (users, processes) to read or change the same data at the same time, while ensuring that the data stays consistent across all requests.

The pessimistic concurrency model handles concurrency by locking data that is being read, so no other sessions can modify that data. It also acquires locks on data being modified, so no other sessions can access that data for either reading or modifying.

The optimistic concurrency model handles concurrency by versioning rows to let readers see the state of the data before concurrent modifications started.

Oracle has always used its optimistic multi-versioning concurrency approach. It does not offer a pessimistic mode – which I don’t miss. Multi-versioning uses the information maintained in Oracle’s rollback segments to provide consistent views.

SQL Server’s default mode is pessimistic concurrency; optimistic row-versioning was added as an afterthought with SQL Server 2005. Row-versioning uses tempdb to store versions. Because SQL Server was initially designed for pessimistic concurrency, it seems safe to assume its overhead for maintaining versions is higher than Oracle’s (see [SQLInt], “Choosing a Concurrency Model”).

The following comparison shows how the different database concurrency modes maintain statement-level read consistency using Read Committed isolation. Note how writers block readers in pessimistic (locking) mode.

Pessimistic Concurrency Mode

With SQL Server pessimistic concurrency mode (default):

  • Readers block writers.
  • Writers block readers.
  • Queries can block.
  • Queries are read-consistent
    on statement level (IsolationLevel=ReadCommitted)
    on transaction level (IsolationLevel >=Repeatable Read)

Oracle does not offer a pessimistic concurrency mode. And I do not miss it.

Optimistic Concurrency Mode

With Oracle optimistic multi-versioning:

  • Readers never block writers.
  • Writers never block readers.
  • Writers block writers only when locking.
  • Queries are non-blocking.
    Writers do not block queries.
  • Queries are read-consistent:
    on statement level (IsolationLevel=ReadCommitted)
    on transaction level (IsolationLevel=Serializable)

With SQL Server optimistic row-versioning:

  • Readers never block writers.
  • Writers never block readers.
  • Writers block writers.
  • Queries are non-blocking.
  • Queries are read-consistent:
    on statement level (IsolationLevel=ReadCommitted Snapshot)
    on transaction level (IsolationLevel=Snapshot)

Choosing pessimistic (locking) or optimistic (versioning) mode

The conservative approach would be to use SQL Server pessimistic mode and only switch to optimistic mode when one can verify that it really works better for a specific application.

On the other hand I strongly dislike the common problem with pessimistic mode where inserting an innocent-looking SELECT into some procedure may cause a conversion deadlock somewhere else. The internet forums are full of questions on how to use SELECT…(NOLOCK) hints with SQL Server.

Microsoft seems to be so confident about their optimistic mode that SQL Azure (which is built on SQL Server) uses optimistic mode only.

In my next SQL Server project I will try to use optimistic mode.

Transaction Isolation Levels

Higher isolation levels offer higher data consistency at the price of performance and scalability.

Oracle supports the following isolation levels:

  • Read Uncommitted – not offered; unnecessary because Oracle’s Read Committed is already non-blocking
  • Read Committed (default)
  • Repeatable Read – not offered; unnecessary because Oracle’s Read Committed is already read-consistent
  • Serializable
  • Read Only

SQL Server supports the following isolation levels:

  • Read Uncommitted
  • Read Committed (default)
  • Read Committed Snapshot (ALTER DATABASE SET READ_COMMITTED_SNAPSHOT ON)
  • Repeatable Read
  • Snapshot
  • Serializable
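As a sketch, the two optimistic modes above are enabled per database (the database name is a placeholder; switching READ_COMMITTED_SNAPSHOT on requires that no other connections are active in the database):

```sql
-- Statement-level versioning: plain Read Committed becomes Read Committed Snapshot
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;

-- Additionally allow sessions to request transaction-level Snapshot isolation
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
```

The first setting changes the behavior of existing Read Committed code without code changes; the second only permits transactions that explicitly ask for Snapshot isolation.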

Some isolation level defaults:

  • SQL Server: Read Committed
  • Oracle: Read Committed
  • SQL Azure: Read Committed Snapshot
  • ADO.NET SqlTransaction: Read Committed
  • ADO.NET OracleTransaction: Read Committed
  • Enterprise Services: Serializable
  • EF SQL Server Provider: Read Committed (*)
  • EF Oracle ODAC Provider: Read Committed (*)
  • LightSwitch: Read Committed for query operations, Repeatable Read for submit operations
  • TransactionScope: Serializable (**)

*) EF automatically creates a new transaction around .SaveChanges() if it is not called within a transaction scope it can enlist in. The isolation level of this new transaction is the default isolation level of the DB provider – Read Committed for both Oracle and SQL Server.

**) Strangely, the default isolation level of TransactionScope is Serializable, which is very expensive with SQL Server. According to [ExpOra], Oracle finds serializable “scalable enough to run all of their TPC-Cs (an industry standard OLTP benchmark)”.
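To avoid depending on these provider and TransactionScope defaults, the isolation level can be made explicit by enlisting the operation in an ambient transaction. A sketch (the `context` variable is an assumed EF context, not from this post):

```vb
'Sketch: enlist .SaveChanges() in an ambient transaction with an explicit isolation level
Dim options As New TransactionOptions With {.IsolationLevel = IsolationLevel.ReadCommitted}
Using tra As New TransactionScope(TransactionScopeOption.Required, options)
    context.SaveChanges() 'enlists in the ambient transaction instead of creating its own
    tra.Complete()
End Using
```

This also sidesteps the surprising Serializable default of a parameterless TransactionScope.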

While it is generally desirable to use Serializable, thus making transactions appear to run one at a time (~serially) rather than concurrently, real-world considerations usually require a compromise between perfect transaction isolation and performance. For most applications Read Committed offers a good balance between consistency and concurrency. In addition it gives us a somewhat (remember the leaking implementation differences!) consistent isolation level across tools, APIs and RDBMSs.

The TransactionScope class provides an elegant way to enclose a set of DB operations (and more: see Using the .NET TransactionScope) in a transaction.

For early testing it is easy to change isolation levels:

  • Toggling between SQL Server Read Committed and Read Committed Snapshot requires no code changes.
  • Use a TransactionScope factory with a default isolation level.
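One possible shape for such a factory (the names are illustrative, not an established API):

```vb
'Sketch of a TransactionScope factory with a central default isolation level
Imports System.Transactions

Public Module Tx
    Public DefaultIsolation As IsolationLevel = IsolationLevel.ReadCommitted

    Public Function CreateScope() As TransactionScope
        Dim options As New TransactionOptions With {
            .IsolationLevel = DefaultIsolation,
            .Timeout = TransactionManager.DefaultTimeout
        }
        Return New TransactionScope(TransactionScopeOption.Required, options)
    End Function
End Module

'Usage: changing Tx.DefaultIsolation in one place switches all transactions for early testing
Using tra As TransactionScope = Tx.CreateScope()
    '... DB operations ...
    tra.Complete()
End Using
```

Routing every scope through one factory is what makes isolation-level experiments a one-line change.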

RDBMS Optimistic Update Conflicts

Oracle (with Serializable only!) and SQL Server (with Snapshot isolation only!) throw an update conflict exception on commit when trying to update a row that has changed since the transaction began. These exceptions must be handled by the application, regardless of the conflict detection pattern the application uses – even with “No Control”! (see DB Concurrency Control with .NET – Details).

  • ORA-08177 “can’t serialize access for this transaction”.
    See [ExpOra], “Transaction Isolation Levels, Serializable”.
  • Msg 3960, “Snapshot isolation transaction aborted due to update conflict”.
    See [SQLInt], “Row Versioning, Update Conflicts”.
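Handling these conflicts typically means retrying the whole transaction. A minimal sketch for SQL Server (assuming System.Data.SqlClient; the retry count is an arbitrary choice):

```vb
'Sketch: retry an optimistic transaction when a Snapshot update conflict (Msg 3960) occurs
Const MaxRetries As Integer = 3
For attempt As Integer = 1 To MaxRetries
    Try
        Using tra As New TransactionScope()
            '... read, modify, update ...
            tra.Complete()
        End Using
        Exit For 'committed without conflict
    Catch ex As SqlException When ex.Number = 3960 AndAlso attempt < MaxRetries
        'Another session changed the row since our snapshot; loop to retry
    End Try
Next
```

The read must be repeated inside the loop – retrying only the UPDATE would silently overwrite the concurrent change.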

Stored Procedures

Using a stored procedure (SP) data access layer is a totally viable approach (see [ExpOra]). I personally dislike the language break and generally use stored procedures in special cases only. See Stored Procedures Pros & Cons.

However, many of the recommendations in my posts are valid for a stored procedure oriented approach too.

DB Concurrency Control with .NET – Details

Books & Articles

[ExpOra]
Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions,
Thomas Kyte

[SQLInt]
Microsoft SQL Server 2008 Internals (Pro – Developer),
Kalen Delaney

About Peter Meinl

IT Consultant

