Sep 17, 2011 03:12 AM|Novo1|LINK
We have an ASP.NET 4.0 application that loads a complex data structure from a database into memory (it is later stored in HttpRuntime.Cache); building that in-memory structure takes over 12 hours. The size of the structure is growing quickly, and we can't keep waiting 12+ hours to get it back into memory whenever the application restarts. This is a major issue whenever we change web.config or any code that causes a restart: it means a long wait before the application can be used again, which hinders both development and deployment.
The data structure MUST be in memory for the website to run at a usable speed. In-memory databases such as memcache or Redis are slow in comparison to HttpRuntime.Cache and would not work in our situation: they have to serialize on every put/get, cached objects can't reference each other directly (everything goes through key lookups, which degrades performance), and with billions of keys performance drops quickly. Performance is a must here.
What we would like to do is quickly dump the HttpRuntime.Cache to disk before the application ends (on a restart), and be able to load it back immediately when the application starts again (hopefully within minutes instead of 12+ hours or days).
The in-memory structure is around 50GB.
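Conceptually, something like this sketch is what we imagine (just a sketch with a hypothetical CacheSnapshot helper; it assumes every cached entry is [Serializable], and BinaryFormatter is exactly the kind of serialization overhead we would rather avoid):

using System.Collections;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Web;
using System.Web.Caching;

public static class CacheSnapshot
{
    private static readonly BinaryFormatter Formatter = new BinaryFormatter();

    // Dump every cache entry to a single file on shutdown.
    public static void Save(string path)
    {
        using (var stream = File.Create(path))
        {
            foreach (DictionaryEntry entry in HttpRuntime.Cache)
            {
                Formatter.Serialize(stream, entry.Key);
                Formatter.Serialize(stream, entry.Value);
            }
        }
    }

    // Rehydrate the cache from disk on startup instead of rebuilding for 12+ hours.
    public static void Load(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            while (stream.Position < stream.Length)
            {
                var key = (string)Formatter.Deserialize(stream);
                var value = Formatter.Deserialize(stream);
                HttpRuntime.Cache.Insert(key, value, null,
                    Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration);
            }
        }
    }
}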
What do you guys think?
Sep 17, 2011 03:57 AM|XIII|LINK
What do you guys think?
That storing 50GB+ in memory sounds like a bad idea, or at least way too much. I'm sure that if you look into it more deeply you can shave quite a lot of data off it, or split it into separate caches.
Sep 17, 2011 04:15 AM|Novo1|LINK
Yes, it does sound like a lot, and it is. If there were a good alternative, speed- and cost-wise, we would not store it in the cache. The issue is that if we don't keep it all in memory and instead lazy-read from a db or other data source as required (evicting data that hasn't been used for a while), performance degrades significantly when a single request can make 15+ accesses to this structure. Having it in memory makes those random accesses very fast.
The secondary cache you are talking about (which we have implemented) is our other in-memory database system, residing on other servers. The latency to them is minimal, but access is not as fast, so calling back and forth between these servers tens of times per request is not really optimal. We split data between HttpRuntime.Cache (fast), the in-memory database (medium), and the database (slow), depending on access requirements. To explain a little more: we can't use the in-memory database for all our caching because of the massive number of keys (in-memory database systems are key/value, so the data normally has to be denormalized).
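Roughly, our reads fall through the tiers like this (a sketch; FetchFromMemoryDb and FetchFromSqlDb are hypothetical stand-ins for our real tier-2 and tier-3 lookups):

using System.Web;

public static class TieredStore
{
    public static object Get(string key)
    {
        // Tier 1: in-process HttpRuntime.Cache - no network hop, no serialization.
        object value = HttpRuntime.Cache[key];
        if (value != null)
            return value;

        // Tier 2: remote in-memory database - network hop plus deserialization.
        value = FetchFromMemoryDb(key);

        // Tier 3: relational database - disk-bound, slowest.
        if (value == null)
            value = FetchFromSqlDb(key);

        // Promote to the fast tier for subsequent requests.
        if (value != null)
            HttpRuntime.Cache.Insert(key, value);

        return value;
    }

    // Hypothetical placeholders for our real tier-2/tier-3 lookups.
    private static object FetchFromMemoryDb(string key) { return null; }
    private static object FetchFromSqlDb(string key) { return null; }
}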
We do not want to use HttpRuntime.Cache, but we have not been able to find another good alternative in .NET for our needs.
Sep 17, 2011 04:24 AM|Novo1|LINK
The cache somehow needs to persist across restarts because it takes too long to regenerate. Is there another alternative?
Sep 17, 2011 04:58 AM|XIII|LINK
I haven't really played with it, but Microsoft provides Windows Server AppFabric Caching. You might want to check that out.
I'm pretty interested in your architecture and why you need so much data (seemingly your entire database) in the cache. Is it because you have thousands of concurrent users on very limited hardware?
Sep 17, 2011 05:22 AM|Novo1|LINK
It would be wonderful if it were our entire database, but it's not. 50GB is only a small part of our database, which is many terabytes in size. 50GB is, for us, the minimum amount we need in memory to accelerate many of the critical features of the site. The problem is that this amount grows every day, and reloading it into memory, as I wrote, could take nearly a day, or longer as the data keeps growing. Restarting the app or deploying new code could take the site down for a day if done on the same server (we currently restart on a mirror and swap over once it has finished loading, but then the mirror's data is out of sync for that day).
AppFabric is a caching solution we decided against because of its limited feature set (at this time) compared to other in-memory database systems. In my previous post I mentioned making 15+ random calls to the runtime cache that complete almost instantly; with AppFabric, the latency added to the total call time would be too long for a system that has to serve many concurrent users with near-instant response times.
Could there be a way of copying the memory used by the cache, putting it on disk, and restoring it, without the complexity and time that all the .NET serializers pile on top of the process (essentially ignoring all the type information on objects and just copying the raw bytes)?
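For the blittable parts of the structure (plain structs with index-based links instead of object references), something like the following sketch would avoid serializers entirely; but it breaks down as soon as real object references are involved, since a raw copy can't fix up managed pointers. Node is a hypothetical example type:

using System.IO;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct Node
{
    public long Id;
    public double Weight;
    public int NextIndex;   // index-based "reference" instead of an object pointer
}

static class RawDump
{
    // Blit an array of plain structs to disk with no serializer involved.
    public static void Save(string path, Node[] nodes)
    {
        var bytes = new byte[nodes.Length * Marshal.SizeOf(typeof(Node))];
        GCHandle handle = GCHandle.Alloc(nodes, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(handle.AddrOfPinnedObject(), bytes, 0, bytes.Length);
        }
        finally
        {
            handle.Free();
        }
        File.WriteAllBytes(path, bytes);
    }

    // Read the raw bytes back and overlay them onto a fresh array.
    public static Node[] Load(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        var nodes = new Node[bytes.Length / Marshal.SizeOf(typeof(Node))];
        GCHandle handle = GCHandle.Alloc(nodes, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(bytes, 0, handle.AddrOfPinnedObject(), bytes.Length);
        }
        finally
        {
            handle.Free();
        }
        return nodes;
    }
}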
Sep 17, 2011 08:33 AM|despos|LINK
This is a nice problem to address... Hearing that Memcached doesn't work for you is unusual, but I assume you spent enough time on it to reach that conclusion. To come to your specific question: yes, I believe you can intercept an app-restart event and persist the cache, then reload it from disk in global.asax. Intercepting the app-restart event requires a bit of a hack, but if memory serves me well, ScottGu demonstrated it once. So it should be OK, though I haven't tried it in ASP.NET 4.
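I don't recall the exact hack, but one documented hook that fires before the app domain unloads is HostingEnvironment.RegisterObject. A sketch (whether the cache is still populated when Stop runs in ASP.NET 4 is something you'd need to verify):

using System.Web.Hosting;

public class CachePersister : IRegisteredObject
{
    public void Start()
    {
        // Register with the hosting environment so Stop is called before unload.
        HostingEnvironment.RegisterObject(this);
    }

    public void Stop(bool immediate)
    {
        if (!immediate)
        {
            // Persist the cache to disk here, before the app domain goes away.
        }
        HostingEnvironment.UnregisterObject(this);
    }
}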
Sep 17, 2011 08:39 AM|Novo1|LINK
We have tried what you described, persisting the cache in the Application_End event in Global.asax, but by then the cache has already been cleared, so there is nothing left to persist. We have also tried adding an OnRemovedFromCache handler to HttpRuntime.Cache, but it is not being fired in .NET 4.0. Is that a bug? We don't know.
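For reference, this is how we attach the callback when inserting (the standard Cache.Insert overload; OnRemovedFromCache is our handler name):

using System.Web;
using System.Web.Caching;

public static class CacheInsertion
{
    public static void InsertWithCallback(string key, object value)
    {
        HttpRuntime.Cache.Insert(
            key, value,
            null,                             // no dependency
            Cache.NoAbsoluteExpiration,
            Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable,
            OnRemovedFromCache);              // the handler we expected at shutdown
    }

    private static void OnRemovedFromCache(string key, object value,
                                           CacheItemRemovedReason reason)
    {
        // Expected to run when an item is evicted or the app unloads;
        // on .NET 4.0 we never see it fire during shutdown.
    }
}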
The main problem is persisting 50GB to disk quickly and then being able to recover it. The easy part is knowing when to do it; the hard part is how.