简体   繁体   中英

Extracting data from one column and storing in another based on a condition

The data I have looks like below-

ID      category     Type_1     Type_2     Match?      Quantity
123       1            A          B       no match        2
123       2            A          B       no match        1
123       3            A          B       no match        6
123       4            A          B       no match        6
123       5            A          B       no match        9
123       6            A          B       no match        1
456       1            A          A        match          6
456       2            A          A        match          4
456       3            A          A        match          4
456       4            A          A        match          3
456       5            A          A        match          0
456       6            A          A        match          1

I want to restructure this table by doing the following-

1) When the category is 3 , I want to create another column cat_3 and store the Type_2 for category 3 underneath that. Create another column Quantity_3 and store the quantity corresponding to 3 underneath that.

2) For all other categories other than 3 , I want to create another column Other_categories and store Type_1 corresponding to other categories underneath that. Create another column Quantity_for_other_categories and store the sum of quantities corresponding to all other categories underneath that.

The result should look like below-

ID    cat_3     Quantity_3     Other_categories   Quantity_for_other_categories    Match?
123     B           6                 A                       19                   no match
456     A           4                 A                       14                   match

Below is for BigQuery Standard SQL

#standardSQL
SELECT id, cat_3, Quantity_3, Other_categories, Quantity_for_other_categories, Match
FROM (
  SELECT id, 
    Type_2 AS cat_3,
    Quantity AS Quantity_3,
    Match
  FROM `project.dataset.table`
  WHERE category = 3
  ORDER BY id
) FULL OUTER JOIN (
  SELECT id, 
    Type_1 AS  Other_categories,
    SUM(Quantity) AS Quantity_for_other_categories,  
    Match
  FROM `project.dataset.table`
  WHERE category != 3
  GROUP BY id, Type_1, Match
)
USING(id, Match)
-- ORDER BY id   

If to apply to sample data from your question - output is

Row id  cat_3   Quantity_3  Other_categories    Quantity_for_other_categories   Match    
1   123 B       6           A                   19                              no match     
2   456 A       4           A                   14                              match    

You can do conditional aggregation. Assuming that type_1 , type_2 and match are fixed for each id , you could do:

select 
    id,
    type_2 cat_3,
    sumif(quantity, category = 3) quantity_3,
    type_1 other_categories,
    sumif(quantity, category <> 3) quantity_for_other_catgories,
    match
from mytable
group by id, type_2, type_1, match

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM