Last post Sep 18, 2012 05:22 PM by MattsDotNetUsername
Sep 18, 2012 03:21 PM|agent_smith|LINK
I am trying to create some design pattern to allow me to calculate metrics on a list of items without adding the properties to the item itself.
For example - I have a customer object and an underlying customer table in a database. I want to 'grade' the customer A-B-C-D depending on some factors such as sales volume and average purchase.
So I will have a 'Grade' property somewhere (not sure if this is part of the Customer class itself).
This seems straighforward for a single customer, because I can to various other data sources and get what I need, but what I want to create a list of customers and sort or filter them by 'grade' and this needs to be caclulated for each, I am concerned that
it will be slow.
I don't think 'Grade' really belongs as part of the Customer class, since it has to do more with the Customer's relation to other objects in the domain, and not the Customer itself.
Sep 18, 2012 03:26 PM|ignatandrei|LINK
calculate the grade once. store in some "database" , with the time calculated.
When reading, see what's the difference of now with time calculation. If more than x hours, recalculate.
How to determine x? DO not know... how often changes the "grades"?
Sep 18, 2012 03:36 PM|agent_smith|LINK
Thanks, but for a large, realtime system batch jobs and such need to be avoided.
Also the calculations could vary depending on users. What if I wanted to let the user relax or tighten the contstraints for grading the customer?
Sep 18, 2012 03:42 PM|ignatandrei|LINK
but for a large, realtime system batch jobs and such need to be avoided.
Realtime ?! are you sure "grades" are realtime ?
However, if so, the best is let database calculate and let dba admin do their jobs.
Sep 18, 2012 03:47 PM|agent_smith|LINK
If we have a large sales order system, and the 'Grade' is used to make descisions, then I need to know the current grade. I cannot wait until tomorrow for a batch process to run.
If a customer makes a large purchase and this affects the grade, it should be reflected immediately. Or if they bounce a check it should similarly be available immediately.
Honestly 'Grade' is an abstraction for what I need to do used to make this example easier to understand.
I have an entity that has many related things within the organization, and need to calculate many things to show some metrics. SQL batch processes are from the 1990's.
Sep 18, 2012 03:52 PM|ignatandrei|LINK
If a customer makes a large purchase and this affects the grade, it should be reflected immediately.
This can be calculated "on demand". Every time a customer makes some operations that affects the grade, launch async a stored proc that calculate the grade and stores it.
Sep 18, 2012 04:16 PM|MattsDotNetUsername|LINK
I would put the Grade in the Customer class. It is the grade of the customer which is a property of that customer so it makes sense to put it there regardless of how it is calculated.
Since you would like this to be real time, and each participant is calulcated individually, it really comes down to how many customers you need to calculate at once and how long each takes. If you need to calculate 100,000 Customers, and you can calculate
100 per second, it would take > 15 minutes to calculate them all.
So if comes down to your need:
Small amount of data: you can keep the caluclation in code and do it on the fly.
Moderate amount of data, and you can perform the calculations for "Grade" in SQL: create a View in your DB to calculate the Grade. SQL server should be able to perfrom this way faster than you could in code since it would not have the overhead of needing
to query each Customer seperately, open connections, convert datatypes, etc...
Large amount of data: batch job :( Recalculate it on a schedule (nightly, hourly, etc...)
Sep 18, 2012 04:43 PM|agent_smith|LINK
I think a good way to brainstorm a design pattern is to remove the concept of a SQL database.
If there were no SQL and batch jobs, how would this work?
Mentioning those things bypasses software design and uses another dependancy on technology (SQL) to overcome design shortcomings.
I know it is an easier way to say 'just run a SQL job to update some data' but that means you are then having to use two things (SQL and software objects) to answer the question instead of just one.
SQL is a storage medium only and should almost never be used for any business logic or processes.
Sep 18, 2012 05:17 PM|ignatandrei|LINK
Do not mention a Sql Database.
Think in terms of "data storage". You must "store" somewhere the results. And you must retrieve them. And you must calculate them - via some "batch" jobs .And some grades will be "updated" on demand( a customer with large command )
That will be all .
Sep 18, 2012 05:22 PM|MattsDotNetUsername|LINK
I don't think that the storage medium changes the issue. If you stored your data in flat files or a DB or wherever, it is still an expensive calculation. You have two choices, run it on the fly and it takes as long as it takes, or precalculate it.
Look at google, they don't figure out on the fly what web pages to show you based on your search string. It is all preaggregated. I wouldn't call that a design shortcoming on googles part. They are dealing with billions of webpages and want millisecond
response times. In a situation like that, there is no other way than to precalculate your data. They couldn't possibly search through the text of every webpage on the internet for each search string entered.
I do agree that you should keep your business logic out of the DB as much as possible. But sometimes it is better/faster/easier to do so. If you do not want to put business logic in your DB, and this is an expensive thing to calculate, you aren't left
with much choice. All that is left is some level of precalulcation. You could have a nightly process to calulcate the Grade and store it in the DB. For anyone that you select during the day that has recieved new data, you could just recalculate those customers
on the fly.