Bill Brown

I have to agree with Dare Obasanjo's latest blog entry about in-memory caching. After working on high-transaction, database-heavy Web applications for the last nine years, there is one thing above all else that I have learned and taken to heart: a Web application is only as good as its caching strategy. My career has seen a progression from light to heavy cache usage, and each new application has benefitted in scalability from that.

Dare's entry got me thinking: why couldn't the RDBMS itself incorporate a distributed, in-memory cache like memcached or Project Velocity? What if a Web application could basically eliminate the need for its own caching layer by relying solely on the database, which would then aggressively and algorithmically use one of the caching services to expand its memory-based caching?
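To make the idea a little more concrete, here is a minimal sketch of what "the database does the caching" might look like from the application's side. Everything in it is an assumption for illustration: `CachingDatabase` is a hypothetical wrapper, a plain dict stands in for memcached or Velocity, SQLite stands in for the RDBMS, and the invalidation rule (drop everything on any write) is far cruder than anything a real engine would ship.

```python
import sqlite3


class CachingDatabase:
    """Hypothetical data layer that caches query results on its own."""

    def __init__(self, conn, cache=None):
        self.conn = conn
        # A dict stands in for an external cache such as memcached.
        self.cache = {} if cache is None else cache
        self.hits = 0
        self.misses = 0

    def query(self, sql, params=()):
        """Read-through: serve from memory if possible, else hit the tables."""
        key = (sql, tuple(params))
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        return rows

    def execute(self, sql, params=()):
        """Run a write, then crudely invalidate every cached result."""
        self.conn.execute(sql, params)
        self.conn.commit()
        self.cache.clear()


conn = sqlite3.connect(":memory:")
db = CachingDatabase(conn)
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO posts (title) VALUES (?)", ("Caching",))

first = db.query("SELECT title FROM posts WHERE id = ?", (1,))   # miss: reads the table
second = db.query("SELECT title FROM posts WHERE id = ?", (1,))  # hit: served from memory
```

The point of the sketch is that the Web application only ever calls `query` and `execute`; it never touches a cache API itself.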

If the problem with query caching in MySQL or SQL Server is the amount of server RAM that can be installed, then distributed caching seems like the perfect solution. It's what the Web server layer uses: why not bring it down to the data layer? Moreover, given the common replication and clustering scenarios, there are likely idle database servers whose memory is going mostly unused. Putting a distributed caching system in place would put them to work while still keeping them ready for failovers.
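As a rough illustration of pooling memory across otherwise idle servers, the sketch below spreads cache keys over several nodes by hashing each key. The node names are made up, each node is just a local dict rather than a remote process, and the simple modulo placement is my own simplification: it remaps most keys whenever a node is added or removed, which is why real deployments tend to use consistent hashing instead.

```python
import hashlib


class DistributedCache:
    """Hypothetical cache that shards keys across several nodes."""

    def __init__(self, node_names):
        # Each node is a plain dict here; in practice it would be a
        # cache process running on a separate (possibly standby) server.
        self.nodes = {name: {} for name in node_names}
        self.names = sorted(node_names)

    def node_for(self, key):
        """Pick a node deterministically from a hash of the key."""
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return self.names[int(digest, 16) % len(self.names)]

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self.node_for(key)].get(key, default)


cache = DistributedCache(["primary", "replica-1", "replica-2"])
for i in range(100):
    cache.set(f"query:{i}", f"result {i}")

# Every node ends up holding a share of the 100 entries.
sizes = {name: len(entries) for name, entries in cache.nodes.items()}
```

The total capacity is the sum of what each node can hold, so a standby replica's RAM adds to the pool instead of sitting idle.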

The main objections I can see are that going to the database might increase network usage, since some cache calls in the Web server layer would never leave the server, and that the database would have to do extra work to decide between file-level and cache-level access. But I expect those costs would be minimal, and the simplification at the Web application level would make them even less objectionable.

It's entirely possible that Project Velocity is being undertaken with exactly this thought in mind. (It's not clear that there's any movement afoot in MySQL AB towards this end—at least from my cursory searches.) This idea would have to be implemented at the RDBMS level.