Last post Aug 19, 2016 07:58 AM by eralper
Aug 18, 2016 07:52 PM|gordon1221|LINK
I have a SQL Server database that I am utilizing in my .NET MVC application.
I need to loop through a table to get the count of each value in the column 'RevID'. Each value needs to end up in the table the same number of times (ok if they are a few off). So, if I have 100 records and value 'A' is in the table 30 times, value 'B'
is in the table '15' times, and value 'C' is in the table '40' times - they all need to be in the table 33 times (100 records / 3 different values).
I do not know how to approach this. I am thinking a while loop with an update statement. Would it be productive to do the updating in C# or stored procedure. Any ideas of how to set this up?
I was working with this but I know it is not correct:. Any input on how this should be addressed is GREATLY appreciated. I am NOT an SQL developer so please be nice ;)
DECLARE @i INT = 1,
@RevCnt INT = SELECT RevId, COUNT(RevId) FROM MyTable group by RevId
WHILE(@RevCnt >= 50)
SET RevID= (SELECT COUNT(RevID) FROM MyTable)
WHERE RevID < 50)
@i = @i + 1
Aug 19, 2016 07:58 AM|eralper|LINK
Your requirement is very interesting for me.
I have not worked on such a case, I will provide a solution for this. I'm not sure if it is the best one but the solution seems to be working quite well according to me.
I'll soon try to enhance the solution at
Update Table Data for Uniform Distribution in SQL of course if the uniform distribution is the correct name on this case
Please check below SQL Update CTE command
;with summary as (
COUNT(*) Over (Partition By 1) Cnt,
COUNT(RevId) Over (Partition By RevId) RevCnt
), todo as (
Cnt / (count(*) over (partition by 1)) Average,
), joint as (
orderno = ROW_NUMBER() over (partition by MyTable.RevId order by id),
RevCnt - Average as forupdate
inner join todo on MyTable.RevId = todo.RevId
), upd as (
where RevCnt < Average and orderno <= (forupdate * -1)
where RevCnt > Average and orderno <= forupdate
), final as (
case when forupdate > 0 then 1 else -1 end as overAvg,
rn = ROW_NUMBER() over (partition by (case when forupdate > 0 then 1 else -1 end) order by id)
RevId = ISNULL(n.RevId,p.RevId)
from MyTable t
inner join final p on p.id = t.id
left join final n on p.rn = n.rn and n.overAvg = -1
where p.overAvg = 1
I strongly suggest you to test the script on different types of data distribution before you use it.
multiple CTE expressions before update statement is formed. Also many SQL functions like
Row_Number() with Partition By clause and SQL Aggregate functions like
Count() with Partition By clause, etc.
I hope it helps you for solution,
I tested the above code block with different data sets, in some cases you need to execute the code twice. First execution minimizes the difference between counts of different categorized values, second will make them near to equal