Low-Lock Singletons In D

DConf 2013 was a huge success!  I’ve finally recovered enough from my red-eye flight back to the East Coast to blog about a topic that really stuck out in my mind from it.

At DConf, I gave a talk on D-specific design patterns.  (The video will be up on dconf.org shortly.)  One pattern in particular attracted much more audience attention than I thought it would, so I decided to blog about it:  The Low-Lock Singleton Pattern.

Most people who program for a living are probably familiar with the traditional Singleton Pattern, which is used to create a single, globally available instance of a class.  A canonical implementation in D is shown below:

class MySingleton {
  static MySingleton get() {
    if (instance_ is null) {
      instance_ = new MySingleton;
    }
    return instance_;
  }
 private:
  this() {}
  __gshared MySingleton instance_;
 }

Most of these people also probably know that this naive implementation is a concurrency nightmare.  You can’t safely call get() from multiple threads simultaneously.  The most obvious reason is because two instances of MySingleton will get created if get() is called from a second thread before it’s finished running in the first thread that called it, i.e. before instance_ !is null from the first call.

If you want thread safety, the most obvious solution is to wrap everything in a mutex, represented in D by a synchronized block.  Omitting all the boilerplate except the get() method, this would look something like:

  static MySingleton get() {
    synchronized {
      if (instance_ is null) {
        instance_ = new MySingleton;
      }
    }
    return instance_;
  }

Now, if Thread 1 calls get(), it has a chance to finish instantiating instance_ and storing a reference to it before Thread 2 checks whether instance_ is null.  There’s only one problem with this:  It comes at a huge performance cost (benchmarks forthcoming).  Entering a synchronized block is expensive.

One solution that looks correct to the naive eye but is subtly broken is double-checked locking.  The idea here is to check whether instance_ is null outside synchronized block, and then if so, check again inside one:

  static MySingleton get() {
    if (instance_ is null) {
      synchronized {
        if (instance_ is null) {
          instance_ = new MySingleton;
        }
      }
    }
    return instance_;
  }

Intuitively this seems reasonable.  instance_ can’t go from non-null to null, only from null to non-null.  Therefore, if it’s already non-null by the time we get to the first check (the one outside the synchronized block) it would seem we know every thing we need to know and can simply return instance_.  Alas, the real world is not that simple.  The details are complicated and best explained by others, but the executive summary is this:  The sequentiality of program execution is an illusion maintained within a single thread.  Compilers and CPUs reorder instructions and the illusion breaks down when dealing with multiple threads.  For example, instance_ can be assigned a non-null value before the object it refers to is fully constructed.

In D, though, thread-local storage (TLS) is a first class member of the language.  This leads to an interesting workaround that’s similar in spirit to double-checked locking but actually works.  It’s a previously obscure pattern that was independently invented by at least myself and Alexander Terekhov, an IBM researcher working in Java.   It never caught on for some reason, probably because of the abysmal performance of TLS in Java back when it was discovered, as illustrated by Doug Lea’s benchmarks from that era.  Anyhow, the pattern is implemented in D below:

class MySingleton {
  static MySingleton get() {
    if (!instantiated_) {
      synchronized {
        if (instance_ is null) {
          instance_ = new MySingleton;
        }
        instantiated_ = true;
      }
    }
    return instance_;
  }
 private:
  this() {}
  static bool instantiated_;  // Thread local
  __gshared MySingleton instance_;
 }

The idea is simple:  We maintain a thread-local flag to track whether we’ve already entered the synchronized block from the current thread and ensured that instance_ is instantiated.  The synchronized block is entered once per thread, amortized over a whole run of the program.

Benchmarks

Ideally, we’d like thread-safe Singletons that are as fast as thread-unsafe Singletons.  How well does our low-lock TLS-based Singleton perform in practice?  I created a micro-benchmark calling get() in a loop 1 billion times and ran it for naive, thread-unsafe Singletons (unsafe), Singletons that enter a synchronized block on every call to get() (synchronized), and the low-lock TLS-based Singletons (TLS).  The results are shown in the graph below:

chart_1

On GDC the overhead of TLS vs. unsafe is virtually undetectable.  On DMD it’s noticeable but small.  In both cases it pales in comparison to the overhead of synchronizing on every call to get().  Note that this benchmark was run in a single thread, so the mutex used to implement the synchronized statement was never contested.  If it were, the results when synchronizing on every call to get() would be far worse.

There you have it:  An obscure Java pattern that was ahead of its time, resurrected in the presence of more modern thread-local storage implementations in D, lets us have our cake and eat it too:  Fast Singletons that are thread safe without any dependence on obscure memory model details.

About these ads
This entry was posted in Uncategorized. Bookmark the permalink.

5 Responses to Low-Lock Singletons In D

  1. exim says:

    Thanks for the interesting article, but I find the usage of the word “invention” bit too much – I mean, you just straightforwardly need/use a `thing` with an atomic read – and the language (D in this case) provides it out of the box.

  2. MIchael says:

    Not a D programmer but doesn’t this work?
    class MySingleton {
    MySingleton get() { return instance_; }
    private:
    this() {}
    __gshared MySingleton instance_ = new MySingleton;
    }

    It should create a new instance once and only on the first time the class is referenced.

    • This reminds me of the best way to do this in Java:

      public class Singleton {
      private static Singleton singleInstance = new Singleton();
      private Singleton() {}
      public static Singleton getSingleInstance() {
      return singleInstance;
      }
      }

      Which only gets initialized once, in a thread safe way, the first time the class is accessed.

    • Yatao Li says:

      Maybe it’s because D is unmanaged and there’s no thread-safe static constructor stub that blocks others from accessing it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s