
Data partitioning by criterion in SQL

How can I split the data by a criterion?

SELECT [Dt]
      , [CustomerName]
      , [ItemRelation]
      , [SaleCount]
      , [DocumentNum]
      , [DocumentYear]
      , [IsPromo]
      , [CustomerType]
  FROM [Action].[dbo].[FC]

[IsPromo] takes the values 0 and 1. For the IsPromo = 0 category, I need to split the data according to the share of days with non-zero [SaleCount].

For example, I have 20 observations with IsPromo = 0, of which only 15 have a non-zero [SaleCount]. The coefficient is the number of days with non-zero SaleCount divided by the total number of days: 15/20 = 0.75. This must be done separately for each stratum (group) defined by [CustomerName] + [ItemRelation] + [DocumentYear]. If the coefficient within a group is greater than 0.71, the group should be written to the table mytab1; if it is less, to mytab2.

How can I do this?

Data sample:

Dt  CustomerName    ItemRelation    SaleCount   DocumentNum DocumentYear    IsPromo
2018-02-19 00:00:00.000 1   11683   0   999 2018    0
2018-02-20 00:00:00.000 1   11683   0   999 2018    0
2018-02-21 00:00:00.000 1   11683   0   999 2018    0
2018-02-22 00:00:00.000 1   11683   0   999 2018    0
2018-02-23 00:00:00.000 1   11683   0   999 2018    0
2018-02-24 00:00:00.000 1   11683   1339    999 2018    0
2018-02-25 00:00:00.000 1   11683   81  999 2018    0
2018-02-26 00:00:00.000 1   11683   487 999 2018    0
2018-02-27 00:00:00.000 1   11683   861 999 2018    0
2018-02-28 00:00:00.000 1   11683   546 999 2018    0
2018-03-01 00:00:00.000 1   11683   722 999 2018    0
2018-03-02 00:00:00.000 1   11683   890 999 2018    0
2018-03-03 00:00:00.000 1   11683   1128    999 2018    0
2018-03-04 00:00:00.000 1   11683   81  999 2018    0
2018-03-05 00:00:00.000 1   11683   884 999 2018    0
2018-03-06 00:00:00.000 1   11683   3675    999 2018    0
2018-03-07 00:00:00.000 1   11683   3780    999 2018    0
2018-03-08 00:00:00.000 1   11683   3178    999 2018    0
2018-03-09 00:00:00.000 1   11683   1749    999 2018    0
2018-03-10 00:00:00.000 1   11683   1243    999 2018    0

This stratum has a coefficient of 0.75, so it goes to mytab1.

And this stratum:

Dt  CustomerName    ItemRelation    SaleCount   DocumentNum DocumentYear    IsPromo
2018-02-19 00:00:00.000 2   11684   0   999 2018    0
2018-02-20 00:00:00.000 2   11684   0   999 2018    0
2018-02-21 00:00:00.000 2   11684   0   999 2018    0
2018-02-22 00:00:00.000 2   11684   0   999 2018    0
2018-02-23 00:00:00.000 2   11684   0   999 2018    0
2018-02-24 00:00:00.000 2   11684   1339    999 2018    0
2018-02-25 00:00:00.000 2   11684   81  999 2018    0
2018-02-26 00:00:00.000 2   11684   487 999 2018    0
2018-02-27 00:00:00.000 2   11684   861 999 2018    0
2018-02-28 00:00:00.000 2   11684   546 999 2018    0
2018-03-01 00:00:00.000 2   11684   722 999 2018    0

has a coefficient of 0.545454545: there are 11 days in the IsPromo = 0 category, and 6 of them have a non-zero SaleCount (6/11), so it goes to mytab2.

A simple method uses avg() with group by, restricted to the IsPromo = 0 rows:

select CustomerName, ItemRelation, DocumentYear,
       avg(case when SaleCount <> 0 then 1.0 else 0.0 end) as nonzero_ratio
from action.dbo.fc
where IsPromo = 0
group by CustomerName, ItemRelation, DocumentYear;

You can then use having avg(case when SaleCount <> 0 then 1.0 else 0.0 end) > 0.71 for your filtering.
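To route the groups into mytab1 and mytab2 as the question asks, one possible T-SQL sketch follows. It rests on several assumptions that are not stated in the thread: mytab1 and mytab2 do not exist yet (SELECT ... INTO creates them), all rows of a qualifying stratum are copied rather than only the IsPromo = 0 rows, a coefficient of exactly 0.71 is sent to mytab2, and the three grouping columns contain no NULLs. The temp table name #coef and the column name coef are made up for illustration.

```sql
-- Step 1: coefficient per stratum, computed over IsPromo = 0 rows only.
-- coef = (days with non-zero SaleCount) / (all days in the stratum).
SELECT CustomerName, ItemRelation, DocumentYear,
       AVG(CASE WHEN SaleCount <> 0 THEN 1.0 ELSE 0.0 END) AS coef
INTO #coef                      -- hypothetical temp table name
FROM [Action].[dbo].[FC]
WHERE IsPromo = 0
GROUP BY CustomerName, ItemRelation, DocumentYear;

-- Step 2: strata with coef > 0.71 go to mytab1.
SELECT f.*
INTO mytab1                     -- assumes mytab1 does not exist yet
FROM [Action].[dbo].[FC] AS f
JOIN #coef AS c
  ON  c.CustomerName = f.CustomerName
  AND c.ItemRelation = f.ItemRelation
  AND c.DocumentYear = f.DocumentYear
WHERE c.coef > 0.71;

-- Step 3: the remaining strata go to mytab2.
-- Assumption: coef = 0.71 exactly is treated as "not greater".
SELECT f.*
INTO mytab2                     -- assumes mytab2 does not exist yet
FROM [Action].[dbo].[FC] AS f
JOIN #coef AS c
  ON  c.CustomerName = f.CustomerName
  AND c.ItemRelation = f.ItemRelation
  AND c.DocumentYear = f.DocumentYear
WHERE c.coef <= 0.71;

DROP TABLE #coef;
```

With the sample data, the stratum (1, 11683, 2018) has coef 15/20 = 0.75 and would land in mytab1, while (2, 11684, 2018) has coef 6/11 and would land in mytab2. A stratum with no IsPromo = 0 rows has no entry in #coef, so under this sketch it is written to neither table. If the target tables already exist, INSERT INTO ... SELECT would be the analogous form.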
