Hot cacheable nothingness

by John on 27 April 2011, filed under Wired in

I’ve been thinking about two things this week, but primarily about fastness. I’m referring to how we can make our code run faster in order to improve user experience and scalability, and to ensure that we get the most out of the resources we have. While thinking about fastness I also ended up thinking about nothingness, or nullity: things that we are looking for that don’t exist yet.

The reason that these null items are interesting is that we have a caching mechanism that uses Memcached to store the results of operations and fetches that are costly. With a cache, any item that is not found has to be looked up or computed via the underlying slow mechanism. This is normally fine; however, performing this costly operation and finding nothing leaves you with a decision about how to handle the null result.
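
For illustration, here is a minimal cache-aside sketch of that look-up path in Python, using the python-memcached client. The user:<id> key scheme and the load_user_from_db() helper are hypothetical stand-ins for our real data layer rather than its actual API.

    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])

    def load_user_from_db(user_id):
        # Hypothetical stand-in for the costly underlying fetch;
        # returns None when no such user exists.
        return None

    def get_user(user_id):
        key = 'user:%d' % user_id
        user = mc.get(key)                    # fast path: check the cache first
        if user is None:
            user = load_user_from_db(user_id) # slow path on a cache miss
            if user is not None:
                mc.set(key, user, time=300)   # cache the result for five minutes
        return user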

One option is to do nothing and simply repeat the look-up next time. In most cases this is reasonable; however, we don’t want to keep performing expensive look-ups that continually find nothing. This has the potential to make code perform badly and can, in the worst case, result in a type of denial of service. What we have done is to store a special NULL class in the cache that marks that we have already tried the costly path and found that nothing exists, so we can give up looking any further. While this uses more memory, it means we can return from the current task quickly and not bother with the costly look-up.
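
A sketch of how that negative-caching idea might look, reusing the client and the hypothetical load_user_from_db() helper from the previous sketch. The NullMarker class name is illustrative, not the actual class in our code.

    class NullMarker(object):
        # Sentinel cached in place of a real value to record that we already
        # tried the costly look-up and found nothing.
        pass

    def get_user(user_id):
        key = 'user:%d' % user_id
        cached = mc.get(key)
        if isinstance(cached, NullMarker):
            return None                       # known-missing: skip the slow path
        if cached is not None:
            return cached                     # ordinary cache hit
        user = load_user_from_db(user_id)     # slow path, hypothetical helper
        mc.set(key, user if user is not None else NullMarker(), time=300)
        return user

Note that callers only ever get back a real object or a plain None; the sentinel never escapes this function.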

All of this is handled transparently by our data layer, so items further up the stack never see this NULL class and simply get a plain old null returned. We have designed it as a write-through cache, so that when we eventually create a real object the NULL marker is simply replaced.
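
Continuing the same sketch, the write-through side might look something like this; store_user_in_db() is again a hypothetical stand-in for the real persistence call.

    def store_user_in_db(user_id, user):
        # Hypothetical stand-in for the real persistence call.
        pass

    def save_user(user_id, user):
        # Write-through: persist first, then update the cache. If a NullMarker
        # had been cached for this key, setting the real object overwrites it.
        store_user_in_db(user_id, user)
        mc.set('user:%d' % user_id, user, time=300)

Because the sentinel is overwritten rather than explicitly invalidated, the read path needs no special handling when an item comes into existence.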

Caching is not a silver bullet for improving application performance, and it introduces several failure modes that need to be addressed. These include the case where the cache goes away and you have to start it again from scratch. A cache can also isolate the underlying data storage mechanisms from the real-world load that the application is handling, and it can hide errors in underlying systems. These failures are not uncommon, particularly those where an empty cache has pushed significant load onto the backing store and caused a major outage; there have been plenty of incidents on very large, busy sites where a problem with a cache has caused significant downtime.

Working out how to handle these cache-related failure shenanigans is the next task (well, maybe after we finalize the API, that is). I would be interested to hear what other people are doing and thinking about when it comes to caching in general, and how they handle failures.
