Last post Feb 23, 2011 04:12 AM by Dave Sussman
Feb 22, 2011 07:05 AM|Haegendoorn|LINK
I am developing a website that contains two kinds of objects (products and categories) that will be accessed on almost every page request.
The category listing will be shown on every page and rendered according to the specific page. I could use a simple user control with output or fragment caching, but I also have a URL rewriting module that needs access to the category listing. The category listing is fairly static, so I guess caching it is recommended.
The products, on the other hand, have some properties that are frequently updated (product availability) and some that are static (price, description, shipping properties, ...). Getting the products out of the DB is an expensive operation because there is a many-to-many (m2m) relationship between products and categories, which makes things more complicated.
What kind of caching strategy do you recommend? I know this is quite abstract and will require review when in production.
# Create a class for the static product properties and cache these, then retrieve the changing properties from the DB every time?
# Because the products will mostly be accessed starting from a category, map all products to their categories at startup?
# Should I invalidate the entire cache every time the data changes, or should I also try to update the cache?
# What do you advise regarding caching the product-category mapping?
I have read some articles on this subject, but most are too general and don't seem realistic at all, or end with: 'Caching can improve or ruin your performance...'.
Any advice is appreciated.
Feb 22, 2011 12:04 PM|nilsan|LINK
Feb 22, 2011 12:39 PM|Haegendoorn|LINK
Thanks for the links
I already read the donut caching article. But is this approach recommended?
This method will cache the page output, but this leaves me with different copies
for each type of browser. And what about sorting, filtering,...?
Which one do you recommend?
Full caching with cache updates (maintaining both a cache and a DB version)
vs Partial cache with frequent DB operations
Feb 23, 2011 04:12 AM|Dave Sussman|LINK
I'd cache the raw data, certainly for categories. Use the ASP.NET Cache (see http://msdn.microsoft.com/en-us/library/aa478965.aspx for a good example of caching stuff, including the API). Since you can time the cache, you could easily cache the categories for, say, 24 hours; effectively you'd only hit the database once every 24 hours (or less often if you use a sliding expiration). The advantage of caching just the data is that it allows the cache to be used by all parts of the application (pages and rewriting); I do a similar thing in one of my applications.
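To make the timed-cache idea concrete, here is a minimal, language-neutral sketch in Python. It is not the ASP.NET Cache API; `TimedCache`, `get_categories` and the `load_from_db` callback are hypothetical names made up for illustration. It shows absolute expiration (reload after a fixed period) and sliding expiration (each read pushes the deadline forward).

```python
import time


class TimedCache:
    """Tiny in-memory cache with absolute or sliding expiration (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._items = {}      # key -> (value, expires_at, sliding_seconds or None)

    def insert(self, key, value, ttl_seconds, sliding=False):
        expires_at = self._clock() + ttl_seconds
        self._items[key] = (value, expires_at, ttl_seconds if sliding else None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at, sliding_seconds = entry
        now = self._clock()
        if now >= expires_at:
            del self._items[key]   # expired: the caller has to reload
            return None
        if sliding_seconds is not None:
            # Sliding expiration: every hit pushes the deadline forward.
            self._items[key] = (value, now + sliding_seconds, sliding_seconds)
        return value


def get_categories(cache, load_from_db, ttl_seconds=24 * 60 * 60):
    """Return the category list, calling the (hypothetical) loader only on a cache miss."""
    categories = cache.get("categories")
    if categories is None:
        categories = load_from_db()
        cache.insert("categories", categories, ttl_seconds)
    return categories
```

Because the cache holds the raw list rather than rendered output, both the page code and the URL rewriting module could call `get_categories` and share the same entry.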
For products, you could cache in a similar way, but just use a smaller time slice; there's always a danger, though, of stale data if you have no invalidation policy. You can use SQL Server cache invalidation, although this has drawbacks too. This will keep a permanent connection (just one) open to the database, and will use that to notify your application of changes to data; it can be fine-grained (i.e., the cache only gets invalidated if the requested data changes), which means you get the best of both cached and up-to-date data.
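The fine-grained invalidation idea can be sketched roughly as follows. This is plain Python, not the SQL Server notification mechanism itself; `ProductCache`, `load_product` and `on_data_changed` are hypothetical names, and the sketch simply assumes that something calls `on_data_changed` when a row changes.

```python
class ProductCache:
    """Caches products until a change notification evicts them (illustrative only)."""

    def __init__(self, load_product):
        self._load_product = load_product   # hypothetical loader: product_id -> product data
        self._items = {}

    def get(self, product_id):
        # Load on first access, then serve from memory until invalidated.
        if product_id not in self._items:
            self._items[product_id] = self._load_product(product_id)
        return self._items[product_id]

    def on_data_changed(self, product_id):
        """Callback the notification channel is assumed to invoke when a product changes."""
        # Only the changed product is evicted; every other entry stays cached.
        self._items.pop(product_id, None)
```

The next `get` for an evicted product reloads it, so readers see fresh data without the whole cache being thrown away.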
Personally, I'd consider how often the product data changes, and use the simpler data cache API, just for smaller periods of time. The real problem you have is that for a new site you have no metrics, no large number of users, with which to test how caching
can improve your application. My plan of attack would be:
1. Centralise all data access
2. Build application
3. Test under load, using VS load test tools, to give you baseline numbers
4. Add simple data caching
5. Re-test; you now have numbers to compare against the baseline
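As a rough sketch of why step 1 matters: if all data access goes through one class, caching can be added later without touching the callers, and the same workload can be replayed before and after. The example below is plain Python with hypothetical names (`ProductRepository`, `CachingProductRepository`, `run_workload`, `query_db`); it counts database hits as a stand-in for real load-test numbers and leaves out expiry for brevity.

```python
class ProductRepository:
    """Single entry point for product data (step 1: centralise data access)."""

    def __init__(self, query_db):
        self._query_db = query_db   # hypothetical function: category_id -> list of products
        self.db_hits = 0            # simple metric to compare runs

    def products_in_category(self, category_id):
        self.db_hits += 1
        return self._query_db(category_id)


class CachingProductRepository(ProductRepository):
    """Step 4: same interface, with a simple data cache in front of the database."""

    def __init__(self, query_db):
        super().__init__(query_db)
        self._cache = {}

    def products_in_category(self, category_id):
        # Only a cache miss falls through to the database-backed parent method.
        if category_id not in self._cache:
            self._cache[category_id] = super().products_in_category(category_id)
        return self._cache[category_id]


def run_workload(repo, requests):
    """Replay a request sequence and return how many database hits it cost."""
    for category_id in requests:
        repo.products_in_category(category_id)
    return repo.db_hits
```

Running `run_workload` with the same request list against both classes gives a baseline figure and a cached figure to compare, which mirrors steps 3 and 5 on a small scale.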