Entity Framework, TransactionScope and MSDTC

January 01, 2010 2:00 AM by Daniel Chambers (last modified on June 16, 2011 2:05 PM)

Update: Please note that the behaviour described in this article only occurs when using SQL Server 2005. SQL Server 2008 (and .NET 3.5+) can handle multiple connections within a transaction without requiring MSDTC promotion.

I've been tightening up code on a website I'm writing for work, and as such I've been improving the transactional integrity of some of the code that talks to our database (written using Entity Framework). Namely, I've been using TransactionScope to create transactions at specific isolation levels to ensure that no weird concurrency issues can slip in.
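As a rough sketch of what "a transaction at a specific isolation level" looks like in code: the parameterless TransactionScope constructor defaults to Serializable, so choosing another level means passing a TransactionOptions. The TransactionScopeFactory helper below is a hypothetical name used only for illustration; it is not part of the site's code.

```csharp
using System;
using System.Transactions;

static class TransactionScopeFactory
{
    // Hypothetical helper (an assumption for illustration): wraps the
    // boilerplate needed to start a scope at an explicit isolation level.
    public static TransactionScope Create(IsolationLevel isolationLevel)
    {
        var options = new TransactionOptions
        {
            IsolationLevel = isolationLevel,
            // Use the configured default timeout rather than hard-coding one.
            Timeout = TransactionManager.DefaultTimeout
        };
        return new TransactionScope(TransactionScopeOption.Required, options);
    }
}

static class Program
{
    static void Main()
    {
        using (TransactionScope scope = TransactionScopeFactory.Create(IsolationLevel.ReadCommitted))
        {
            // Transaction.Current is the ambient transaction that any
            // connection opened inside the scope will enlist in.
            Console.WriteLine(Transaction.Current.IsolationLevel);

            scope.Complete();
        }
    }
}
```

Any Entity Framework queries would go inside the using block, before the call to Complete().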

TransactionScope is very powerful. It can maintain your transaction across two (or more) database connections (or at least the SQL Server database code that uses it can), or even after you close the connection to the database. It does this by promoting your transaction to a "distributed transaction" managed by the Microsoft Distributed Transaction Coordinator (MSDTC) when you start to use multiple connections, or close the connection that your transaction is currently running on. So, essentially, as soon as your transaction becomes something that SQL Server can't handle with its normal transaction, the transaction is palmed off to MSDTC to manage.

An MSDTC transaction comes at a performance price as the transaction is no longer a lightweight transaction managed by SQL Server internally, but a heavyweight MSDTC transaction that is much more powerful. So, if at all possible, we do not want to use MSDTC (unless, of course, we actually need it).

However, there is a funny (or not so funny, when you think about it) behaviour that Entity Framework exhibits that causes your innocent transaction to be promoted to being an MSDTC transaction. Consider this code:

using (TestDBEntitiesContext context = new TestDBEntitiesContext())
{
    using (TransactionScope transaction = new TransactionScope())
    {
        var authors = (from author in context.Authors
                       select author).ToList();

        int count = (from author in context.Authors
                     select author).Count();

        transaction.Complete();
    }
}

This looks like a pretty innocent bit of code that should not result in transaction promotion. By putting the lifetime of the transaction (its using block) inside the lifetime of the ObjectContext, we ensure that the transaction cannot outlive any connection used by the context, and therefore it should not be able to cause a promotion to an MSDTC transaction.

However, this code causes a transaction promotion on the second query (the one that calls Count()). You can see this by ensuring the MSDTC service ("Distributed Transaction Coordinator") is stopped and then watching for the exception that is thrown because the SqlConnection cannot promote the transaction while MSDTC is not running.
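Stopping the service is not the only way to notice a promotion. As a minimal sketch, System.Transactions can be asked directly: TransactionInformation.DistributedIdentifier stays at Guid.Empty until the transaction is escalated, and TransactionManager.DistributedTransactionStarted fires when escalation happens. The PromotionProbe class is a hypothetical helper name, and the placeholder comment marks where the queries from the example would go.

```csharp
using System;
using System.Transactions;

static class PromotionProbe
{
    // Hypothetical helper (an assumption for illustration): returns true once
    // the given transaction has been escalated to a distributed transaction.
    public static bool IsPromoted(Transaction transaction)
    {
        return transaction != null
            && transaction.TransactionInformation.DistributedIdentifier != Guid.Empty;
    }
}

static class Program
{
    static void Main()
    {
        // Raised whenever any transaction in this process is promoted.
        TransactionManager.DistributedTransactionStarted += (sender, e) =>
            Console.WriteLine("Promoted: " +
                e.Transaction.TransactionInformation.DistributedIdentifier);

        using (var scope = new TransactionScope())
        {
            Console.WriteLine("Promoted before any work: " +
                PromotionProbe.IsPromoted(Transaction.Current));

            // Run the queries here and call IsPromoted after each one
            // to find the statement that triggers the promotion.

            scope.Complete();
        }
    }
}
```

Checking IsPromoted after each query narrows down which statement triggers the escalation without having to stop MSDTC on the machine.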

Why does this occur? A bit of digging on MSDN comes up with this bit of innocuous documentation:

Promotion of a transaction to a DTC may occur when a connection is closed and reopened within a single transaction. Because Object Services opens and closes the connection automatically, you should consider manually opening and closing the connection to avoid transaction promotion.

By sticking some breakpoints in the code above, we can observe this behaviour in action. The connection (ObjectContext.Connection) is closed by default; it is opened for the first query and closed immediately afterwards, then opened again for the second query and closed again. It is this second opening of the connection that causes the transaction promotion!

At first glance this seemed to me to be an inefficient way of handling the connection! It's not that uncommon that one would want to do more than one thing with the ObjectContext in sequence and having a connection opened and closed for each query seems really inefficient.

However, upon further thought, I realised the reason why the Entity Framework team does this is probably to cover the use case where you have a long-lived ObjectContext (unlike here where we create it quickly, use it, and then throw it away). If the ObjectContext is going to be around for a long time (perhaps you've got one hanging around supplying data to a WPF dialog) we don't want it to hog a connection for all that time (99% of the time it will be idling waiting for the user!).

However, this "feature" gets in our way when using the ObjectContext in the manner above. To change this behaviour you need to sack the ObjectContext from the connection management job and do it yourself:

using (TestDBEntitiesContext context = new TestDBEntitiesContext())
{
    using (TransactionScope transaction = new TransactionScope())
    {
        context.Connection.Open();

        var authors = (from author in context.Authors
                       select author).ToList();

        int count = (from author in context.Authors
                     select author).Count();

        transaction.Complete();
    }
}

Notice that we now open the connection ourselves by calling context.Connection.Open() before running the queries. This ensures that the connection stays open for the duration of both queries and is closed when the ObjectContext is disposed at the end of its using block.

Using this technique we can avoid the accidental promotion of our lightweight database transaction to a heavyweight MSDTC transaction and thereby scrape back some lost performance.

Categories: Software Development

Tags: C#, Entity Framework, MSDTC, Transactions

Comments (2)

sib

May 19, 2011 9:11 AM

Thanks a lot! You saved my day. The main reason we do not want to use MSDTC is that it can be turned off on client servers or can lack network permissions, and deployment becomes a nightmare!

Unknown (myopenid)

January 24, 2012 3:07 PM

Hi, I am writing a serializable transaction and have run into some trouble.
If I execute only one thread, it runs fast, with no waiting.
If I execute two or more threads, the first thread to complete takes a long time but the subsequent threads are fast. Why?

If I run two processes of this program concurrently, each process's first thread takes a long time to complete but the subsequent ones are fast. Is this normal?

I am using SQL Server 2008 Express.

Thanks.

using System;
using System.Data;          // UpdateException (Entity Framework 4)
using System.Linq;          // First()
using System.Threading;
using System.Transactions;

class Program
{
    int numero;

    Program(int numero)
    {
        this.numero = numero;
    }

    static void Main(string[] args)
    {
        Console.WriteLine("You can press a key to exit at any time");

        // Start ten threads that each try to insert one Persona.
        for (int i = 0; i < 10; i++)
        {
            Program p = new Program(i + 1);
            Thread t = new Thread(p.Ejecutar);
            t.Start();
        }

        Console.ReadKey();
    }

    public void Ejecutar()
    {
        bool esPosibleProseguir = true;

        // Retry until the insert commits or an unexpected exception occurs.
        while (esPosibleProseguir)
        {
            Database1Entities cx = new Database1Entities();
            TransactionOptions opts = new TransactionOptions();
            opts.IsolationLevel = System.Transactions.IsolationLevel.Serializable;
            System.Transactions.TransactionScope scope = new System.Transactions.TransactionScope(TransactionScopeOption.Required, opts);

            try
            {
                //var mayor = cx.Numeracion.First();
                //int nuevo = mayor.Numero + 1;

                // Open the connection manually so that it stays open for the whole transaction.
                cx.Connection.Open();

                int nuevo = cx.VistaNumero.First().Actual;

                var per = new Persona();
                per.Nombre = string.Format("Persona-{0}", nuevo);
                per.Ci = (1000 + nuevo).ToString();
                per.Numero = nuevo;
                cx.Persona.AddObject(per);

                //mayor.Numero = nuevo;

                cx.SaveChanges();
                scope.Complete();

                Console.WriteLine("Thread: {0}, Created person: {1}", this.numero, nuevo);
                break;
            }
            catch (UpdateException)
            {
                // The update conflicted with another thread: try again.
                esPosibleProseguir = true;
            }
            catch (TransactionAbortedException)
            {
                // The transaction was aborted: try again.
                esPosibleProseguir = true;
            }
            catch (Exception ex)
            {
                Console.WriteLine("Thread: {0}, Exception: {1}, Message: {2}", this.numero, ex.GetType().Name, ex.Message);
                esPosibleProseguir = false;
            }
            finally
            {
                // Dispose the scope before the context so the transaction ends first.
                DisponerObjeto(scope);
                DisponerObjeto(cx);
            }
        }
    }

    // Disposes an object and logs, rather than propagates, any exception thrown by Dispose.
    private void DisponerObjeto(IDisposable obj)
    {
        try
        {
            obj.Dispose();
        }
        catch (Exception ex)
        {
            Console.WriteLine("Thread: {0}, Exception: {1}, Message: {2}", this.numero, ex.GetType().Name, ex.Message);
        }
    }
}