Saturday, November 13, 2010

Tridion Content Delivery and Caching

I recently had to help a customer improve the performance of their (Tridion-powered) site. The initial claim was that Tridion caching was not performing as expected and their database was being hit too many times.

I have used the Tridion cache in all my projects, and especially when coupled with Dynamic Component Presentations I see drastic performance improvements. I don't remember ever having performance issues related to this, so I was a bit surprised by the claim.

A quick review of the code for this "static" site revealed why the cache was not performing as desired - and I'll get to this in a second.

The Tridion content delivery cache uses a fairly complex dependency algorithm and something we call the "Cache Channel Service" to notify all delivery nodes when a given item has been published/republished/unpublished and therefore invalidate its cached version. When all is operating normally you will NEVER get an out-of-date page, even on a large (100+ servers) delivery environment. (You can read more details about the Cache Channel Service in this article at SDLTridionWorld).

However, what developers very often seem not to understand is what is cached, and what a "dynamic" page is in the context of Tridion. So let's set out some definitions here.

Tridion Dynamic Page.
We call a page "Dynamic" not when it responds to a user's behavior, preferences or profile. We consider a page as dynamic if the system doesn't necessarily "know" what components will be displayed at the time the page was published. A good example of such a page is a "press release index" page, where the component presentations displayed are dynamically retrieved from the Content Broker at run time.

A page that includes ASCX controls to perform taxonomy lookups and determine which component presentations to display is not what we would call static - even if the page contains no logic to interact with the user's profile (which raises some more questions about why the taxonomy queries are being performed, but that's an issue for another day).

Tridion Object Cache
As the name implies, the Tridion object cache contains all the recently used Tridion objects loaded by your application - this typically includes Page Metadata, Component Metadata, Component Presentations, Dynamic Links, Keywords, Taxonomy objects, etc.

It allows us to minimize the number of times a given object is loaded from the content repository (file system or database) by storing a cached version of it - which gets invalidated only when 1) it is republished/unpublished or 2) the cache runs out of usable memory (configurable setting, by default set to 16 MB).

OK, so what does "object" cache mean?
It means that queries to the content broker (for instance "all press releases sorted by last published") will result in a set of objects being received by your application - and all those objects will probably be in memory already. So when you call ComponentPresentation.GetContent() you will have an average wait time of 0 milliseconds or thereabouts.

However, your query results are not cached. In the context of:
  • How cache invalidation occurs;
  • What we consider dynamic;
  • How often your content may be updated,
it makes sense that database query results don't get cached - otherwise any publishing action would invalidate all query result caching, and on a site that is updated thousands of times a day, this would result in more cache invalidation messages than anything else.

We could automatically make the results cached for a duration of, let's say, 5 minutes - but then you don't get the latest results, etc.

So Tridion (as we expect) took the "hands-off" approach of letting you control the cache through your application.

OK, so back to the issue at hand. I'm trying to ensure that the page I currently load, which may display up to 10 Component Presentations based on keywords attached to these components, improves its average load time from about 1 second to as quick as possible. Since all these queries are going to the database, even if the database cache is performing correctly, you will have to cope with additional db connections, network latency and what-not - multiply that by 10-12 queries per page, and you have a recipe for "slowness".

This site doesn't get updated very often, and has a very high average number of sessions, so I decided to test it with a 5-minute ASP.NET cache object for all my queries. Each individual query result will be cached for a maximum of 5 minutes, which is an acceptable delay for content editors/publishers.
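
To make the "maximum of 5 minutes" behaviour concrete, here is a minimal, self-contained sketch of a time-bounded cache. It is not the ASP.NET cache and not Tridion code: the TimedCache class, its GetOrLoad method and the injectable clock are all names made up for this illustration, and a plain dictionary stands in for the real cache.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a tiny time-bounded cache keyed by string.
// TimedCache and GetOrLoad are hypothetical names, not a Tridion or ASP.NET API.
public class TimedCache<T>
{
    private class Entry
    {
        public T Value;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly object sync = new object();
    private readonly TimeSpan lifetime;
    private readonly Func<DateTime> clock;

    // The clock is injectable so that expiry can be exercised without waiting.
    public TimedCache(TimeSpan lifetime, Func<DateTime> clock)
    {
        this.lifetime = lifetime;
        this.clock = clock;
    }

    // Returns the cached value for the key; calls load() only on a miss or after expiry.
    public T GetOrLoad(string key, Func<T> load)
    {
        lock (sync)
        {
            DateTime now = clock();
            Entry entry;
            if (entries.TryGetValue(key, out entry) && entry.ExpiresAt > now)
                return entry.Value;

            // Miss or expired: load once and remember the result until now + lifetime.
            entry = new Entry { Value = load(), ExpiresAt = now + lifetime };
            entries[key] = entry;
            return entry.Value;
        }
    }
}
```

With a lifetime of TimeSpan.FromMinutes(5), repeated calls to GetOrLoad with the same key run the loader once, and then again only after the 5 minutes have passed. That is the trade-off accepted here: editors may wait up to 5 minutes to see a change, and in exchange the database is hit once per query per window. Holding the lock while loading keeps the sketch short; a production cache would likely avoid that.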

What did I find?
Average load time went from 800 milliseconds to 17 milliseconds with an extremely simple change to my code.

For many reasons, I will not post the code as it is currently used at the customer, but here are the basics of it.
A wrapper method was created for Query (Tridion.ContentDelivery.DynamicContent.Query) which takes two parameters: the query object and a unique (per query) cache key. This cache key is nothing more than the string concatenation of the query parameters.

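private String[] ExecuteTridionQuery(Query query, String cacheKey)
{
  // Return the cached results if this query ran within the last 5 minutes.
  String[] results = (String[])Cache.Get(cacheKey);
  if (results != null)
    return results;
  // Cache miss: run the query against the Content Broker.
  results = query.ExecuteQuery();
  if (results != null)
  {
    // Absolute 5-minute expiration, no sliding expiration.
    Cache.Add(cacheKey, results, null,
      DateTime.Now.AddMinutes(5),
      System.Web.Caching.Cache.NoSlidingExpiration,
      System.Web.Caching.CacheItemPriority.Normal, null);
  }
  return results;
}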
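
The cache key is simply the query parameters concatenated into a string. As a rough sketch of what that could look like, assuming a query defined by a publication id, a sort column and a set of keyword URIs (the QueryCacheKey class, its parameters, the "TridionQuery" prefix and the "|" separator are all invented for this example and are not the customer's code):

```csharp
using System;

// Hypothetical helper: builds one cache key per distinct query.
public static class QueryCacheKey
{
    public static string Build(int publicationId, string sortColumn, params string[] keywordUris)
    {
        // Sort a copy so the same keywords in a different order share one cache entry.
        string[] sorted = (string[])keywordUris.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        // The prefix keeps these keys apart from anything else stored in the same cache.
        return "TridionQuery|" + publicationId + "|" + sortColumn + "|" + string.Join(",", sorted);
    }
}
```

Whatever shape the key takes, the important part is that two queries which can return different results never end up with the same key, otherwise one query would be served the cached results of the other.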

And voilà, an instant boost of roughly 50x (800 milliseconds down to 17) for your "static" page.