Data partitioning by criterion in SQL
How can I split the data by a criterion?
SELECT [Dt]
, [CustomerName]
, [ItemRelation]
, [SaleCount]
, [DocumentNum]
, [DocumentYear]
, [IsPromo]
, [CustomerType]
FROM [Action].[Dbo].[FC]
[IsPromo] has the values 0 and 1. For the zero category (IsPromo = 0), I need to split the data according to the share of non-zero sales in [SaleCount].
For example, I have 20 observations with IsPromo = 0, of which only 15 have a non-zero [SaleCount]. The coefficient is the number of days with non-zero sales divided by the total number of days: 15/20 = 0.75.
It must be computed separately for each stratum (group) [CustomerName] + [ItemRelation] + [DocumentYear].
If the coefficient within a group is greater than 0.71, the group should be written into table mytab1; if it is less, into mytab2.
How can I do that?
Data sample:
Dt CustomerName ItemRelation SaleCount DocumentNum DocumentYear IsPromo
2018-02-19 00:00:00.000 1 11683 0 999 2018 0
2018-02-20 00:00:00.000 1 11683 0 999 2018 0
2018-02-21 00:00:00.000 1 11683 0 999 2018 0
2018-02-22 00:00:00.000 1 11683 0 999 2018 0
2018-02-23 00:00:00.000 1 11683 0 999 2018 0
2018-02-24 00:00:00.000 1 11683 1339 999 2018 0
2018-02-25 00:00:00.000 1 11683 81 999 2018 0
2018-02-26 00:00:00.000 1 11683 487 999 2018 0
2018-02-27 00:00:00.000 1 11683 861 999 2018 0
2018-02-28 00:00:00.000 1 11683 546 999 2018 0
2018-03-01 00:00:00.000 1 11683 722 999 2018 0
2018-03-02 00:00:00.000 1 11683 890 999 2018 0
2018-03-03 00:00:00.000 1 11683 1128 999 2018 0
2018-03-04 00:00:00.000 1 11683 81 999 2018 0
2018-03-05 00:00:00.000 1 11683 884 999 2018 0
2018-03-06 00:00:00.000 1 11683 3675 999 2018 0
2018-03-07 00:00:00.000 1 11683 3780 999 2018 0
2018-03-08 00:00:00.000 1 11683 3178 999 2018 0
2018-03-09 00:00:00.000 1 11683 1749 999 2018 0
2018-03-10 00:00:00.000 1 11683 1243 999 2018 0
This stratum has coef = 0.75 (15 non-zero days out of 20), so it goes to mytab1.
And this stratum:
Dt CustomerName ItemRelation SaleCount DocumentNum DocumentYear IsPromo
2018-02-19 00:00:00.000 2 11684 0 999 2018 0
2018-02-20 00:00:00.000 2 11684 0 999 2018 0
2018-02-21 00:00:00.000 2 11684 0 999 2018 0
2018-02-22 00:00:00.000 2 11684 0 999 2018 0
2018-02-23 00:00:00.000 2 11684 0 999 2018 0
2018-02-24 00:00:00.000 2 11684 1339 999 2018 0
2018-02-25 00:00:00.000 2 11684 81 999 2018 0
2018-02-26 00:00:00.000 2 11684 487 999 2018 0
2018-02-27 00:00:00.000 2 11684 861 999 2018 0
2018-02-28 00:00:00.000 2 11684 546 999 2018 0
2018-03-01 00:00:00.000 2 11684 722 999 2018 0
has coef = 6/11 ≈ 0.545454545: there are 11 days in the IsPromo = 0 category, and 6 of them have a non-zero SaleCount, so it goes to mytab2.
A simple method uses avg() with group by. Restrict the rows to IsPromo = 0 and average a 0/1 flag that marks non-zero SaleCount:
select CustomerName, ItemRelation, DocumentYear,
       avg(case when SaleCount <> 0 then 1.0 else 0.0 end) as nonzero_ratio
from action.dbo.fc
where IsPromo = 0
group by CustomerName, ItemRelation, DocumentYear;
The else 0.0 branch matters: without it the case expression yields NULL for the other rows, avg() ignores NULLs, and every group comes out as 1.0.
You can then use having avg(case when SaleCount <> 0 then 1.0 else 0.0 end) > 0.71 for your filtering.
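To actually write the rows into mytab1 and mytab2, one option is a window-function variant of the same avg() idea, so that each row carries its group's coefficient. This is only a sketch under a few assumptions that the question does not settle: the temp table name #scored and the column name coef are made up for illustration, mytab1 and mytab2 are assumed not to exist yet (SELECT ... INTO creates them), only the IsPromo = 0 rows are copied, and a group whose coefficient is exactly 0.71 is sent to mytab2.

```sql
-- Attach the per-group coefficient to every IsPromo = 0 row.
-- #scored and coef are hypothetical names, not part of the original schema.
SELECT [Dt], [CustomerName], [ItemRelation], [SaleCount],
       [DocumentNum], [DocumentYear], [IsPromo], [CustomerType],
       AVG(CASE WHEN [SaleCount] <> 0 THEN 1.0 ELSE 0.0 END)
           OVER (PARTITION BY [CustomerName], [ItemRelation], [DocumentYear]) AS coef
INTO #scored
FROM [Action].[dbo].[FC]
WHERE [IsPromo] = 0;

-- Groups above the threshold go to mytab1 (assumes mytab1 does not exist yet).
SELECT * INTO mytab1 FROM #scored WHERE coef > 0.71;

-- Everything else, including coef = 0.71 exactly, goes to mytab2 (an assumption).
SELECT * INTO mytab2 FROM #scored WHERE coef <= 0.71;

DROP TABLE #scored;
```

With the sample data above, the group (1, 11683, 2018) would get coef 0.75 and land in mytab1, while (2, 11684, 2018) would get 6/11 and land in mytab2. If the target tables already exist, INSERT INTO ... SELECT would be the statement to use instead of SELECT ... INTO.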