T-SQL Unequal Decile based on totals in another column
I didn't see an exact answer for what I was looking for. I have a table with an ID and two values. I need to sort the first value column low to high and then decile the list so that each decile has an equal (or almost equal) total of value2. Here's an example using quartiles for space considerations:
I have:
ID value1 value2
1 2 132
2 6 182
3 5 195
4 8 152
5 3 132
6 9 129
7 3 180
8 9 120
9 3 172
10 6 192
11 9 177
12 12 151
Each quartile should total about 478.5 (value2 sums to 1914, and 1914 / 4 = 478.5).
Sorting by value1 gets this, but I need to be able to assign a quartile so that each one totals about 478.5. I have manually entered sample quartiles, which may or may not be correct based on the calculations:
ID value1 value2 Qtle
1 2 132 1
5 3 132 1
7 3 180 1
9 3 172 2
3 5 195 2
2 6 182 3
10 6 192 3
4 8 152 3
6 9 129 4
8 9 120 4
11 9 177 4
12 12 151 4
Sorry about the formatting - first post.
Edit 1 - I think I might have solved it, although it's probably not as elegant as it could be.
Edit 2 - Added sample quartiles above and fixed the code below to reflect quartiles instead of deciles. Also fixed the sum of value2.
SELECT value1
      ,value2
      ,SUM(value2) OVER (ORDER BY value1) AS CumSum
      ,CASE
           -- Dividing by 4.0 avoids integer division when the total is not a multiple of 4
           WHEN SUM(value2) OVER (ORDER BY value1) <     (SELECT SUM(value2) FROM Table1) / 4.0 THEN 1
           WHEN SUM(value2) OVER (ORDER BY value1) < 2 * (SELECT SUM(value2) FROM Table1) / 4.0 THEN 2
           WHEN SUM(value2) OVER (ORDER BY value1) < 3 * (SELECT SUM(value2) FROM Table1) / 4.0 THEN 3
           ELSE 4
       END AS Quartile
FROM Table1
I hope I've understood this correctly...
The following is a generic approach. You can specify the number of tiles with the variable @TileCount:
DECLARE @Table1 TABLE(ID INT,value1 INT,value2 INT);
INSERT INTO @Table1 VALUES
(1,2,132)
,(2,6,182)
,(3,5,195)
,(4,8,152)
,(5,3,132)
,(6,9,129)
,(7,3,180)
,(8,9,120)
,(9,3,172)
,(10,6,192)
,(11,9,177)
,(12,12,151);
DECLARE @TileCount INT=4;
WITH Sums AS
(
    -- One row per tile: its rank and its cumulative upper bound (SumPart)
    SELECT TOP (@TileCount)
           ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS TileRank
          ,A.SumTotal
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) * (A.SumTotal / CAST(@TileCount AS FLOAT)) AS SumPart
    FROM master..spt_values
    CROSS APPLY(SELECT (SELECT SUM(value2) FROM @Table1) AS SumTotal) AS A
)
,AddCumSum AS
(
    -- Running sum of value2 in value1 order
    SELECT value1
          ,value2
          ,SUM(value2) OVER (ORDER BY value1) AS CumSum
    FROM @Table1
)
-- For each row, pick the first tile whose upper bound covers the running sum
SELECT AddCumSum.*
      ,A.SumPart
      ,A.TileRank AS Tile
FROM AddCumSum
OUTER APPLY(SELECT TOP 1 * FROM Sums WHERE CumSum<=SumPart ORDER BY TileRank ASC) AS A;
The result
+--------+--------+--------+---------+------+
| value1 | value2 | CumSum | SumPart | Tile |
+--------+--------+--------+---------+------+
| 2      | 132    | 132    | 478.5   | 1    |
| 3      | 132    | 616    | 957     | 2    |
| 3      | 180    | 616    | 957     | 2    |
| 3      | 172    | 616    | 957     | 2    |
| 5      | 195    | 811    | 957     | 2    |
| 6      | 182    | 1185   | 1435.5  | 3    |
| 6      | 192    | 1185   | 1435.5  | 3    |
| 8      | 152    | 1337   | 1435.5  | 3    |
| 9      | 120    | 1763   | 1914    | 4    |
| 9      | 129    | 1763   | 1914    | 4    |
| 9      | 177    | 1763   | 1914    | 4    |
| 12     | 151    | 1914   | 1914    | 4    |
+--------+--------+--------+---------+------+
The CTE Sums computes some values up front, which allows you to use them like named variables. @TileCount is used within the TOP clause in connection with ROW_NUMBER(), selecting from master..spt_values. This is nothing more than a well-filled table. We are not interested in its values; we just need it as a base to get a running number.
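To see the running-number trick in isolation, the sketch below runs just that part of the Sums CTE. It assumes read access to master..spt_values, which is an undocumented system table, so the second query shows one possible substitute built from a VALUES list. The names Digits, Numbers, d and n are made up for this illustration and are not part of the original answer.

```sql
DECLARE @TileCount INT = 4;

-- Same pattern as in the Sums CTE: the table only supplies rows to count.
SELECT TOP (@TileCount)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS TileRank
FROM master..spt_values;

-- A possible substitute without the system table (hypothetical names):
-- ten digits cross-joined with themselves give up to 100 rows to number.
WITH Digits AS
(
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
)
,Numbers AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM Digits AS a
    CROSS JOIN Digits AS b
)
SELECT n AS TileRank
FROM Numbers
WHERE n <= @TileCount
ORDER BY n;
```

Either way, the largest usable @TileCount is limited by how many rows the source can supply.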
The second CTE, AddCumSum, returns the rows together with the running sum of value2.
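Note that all rows sharing the same value1 show the same CumSum (616 for the three rows with value1 = 3). This follows from the default window frame: with only an ORDER BY, SUM() OVER uses RANGE UNBOUNDED PRECEDING, which includes every peer of the current row. The sketch below contrasts that with a ROWS frame on the first five rows of the sample. Using ID as a tie-breaker and the column names CumSumRange and CumSumRows are assumptions for illustration, not part of the original answer.

```sql
DECLARE @Table1 TABLE(ID INT, value1 INT, value2 INT);
INSERT INTO @Table1 VALUES
 (1,2,132),(5,3,132),(7,3,180),(9,3,172),(3,5,195);

SELECT ID
      ,value1
      ,value2
       -- Default RANGE frame: ties on value1 are summed together
      ,SUM(value2) OVER (ORDER BY value1) AS CumSumRange
       -- ROWS frame with a tie-breaker: one step per row
      ,SUM(value2) OVER (ORDER BY value1, ID ROWS UNBOUNDED PRECEDING) AS CumSumRows
FROM @Table1
ORDER BY value1, ID;

-- CumSumRange: 132, 616, 616, 616, 811
-- CumSumRows : 132, 264, 444, 616, 811
```

With the ROWS variant, rows with equal value1 can land in different tiles, which is what the hand-assigned sample in the question does (IDs 5 and 7 in quartile 1, ID 9 in quartile 2). Which behaviour is wanted depends on whether ties should stay together.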
The final SELECT finds, for each row, the smallest TileRank whose SumPart is at or above the running sum.
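Because each SumPart equals TileRank * SumTotal / @TileCount, the smallest fitting TileRank can also be computed directly as the ceiling of CumSum * @TileCount / SumTotal. The following is a sketch of that variant, not the method used in the answer. It assumes value2 holds positive integers, so the ceiling can be done in integer arithmetic, and it adds a SumTotal column to AddCumSum purely for illustration.

```sql
DECLARE @Table1 TABLE(ID INT, value1 INT, value2 INT);
INSERT INTO @Table1 VALUES
 (1,2,132),(2,6,182),(3,5,195),(4,8,152),(5,3,132),(6,9,129)
,(7,3,180),(8,9,120),(9,3,172),(10,6,192),(11,9,177),(12,12,151);

DECLARE @TileCount INT = 4;

WITH AddCumSum AS
(
    SELECT value1
          ,value2
          ,SUM(value2) OVER (ORDER BY value1) AS CumSum   -- running sum, ties share a value
          ,SUM(value2) OVER ()                AS SumTotal -- grand total repeated on every row
    FROM @Table1
)
SELECT value1
      ,value2
      ,CumSum
       -- Integer ceiling of CumSum * @TileCount / SumTotal;
       -- the BIGINT cast guards against overflow in the multiplication
      ,(CAST(CumSum AS BIGINT) * @TileCount + SumTotal - 1) / SumTotal AS Tile
FROM AddCumSum
ORDER BY value1;
```

For the sample data this assigns the same tiles as the result table above (1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4) without needing a numbers source or the OUTER APPLY.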