
How to combine members of a column, collect them in a data frame, and give them new names in R?

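I want to create a new data frame from my old one by combining the members of a specific column and giving each combination a new name. For example, this is my old data frame: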

df <- structure(list(ID= c("x1", "x1", "x1", "x1", "x1", "x1", "x2", "x2", "x2", "x2", "x2", "x2", "x3", "x3", "x3", "x3", "x3", "x3", "x1", "x1", "x1", "x1", "x1", "x1", "x2", "x2", "x2", "x2", "x2", "x2", "x3", "x3", "x3", "x3", "x3", "x3"),
col1=c("a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a1","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2","a2"),
col2= c("a", "b", "c", "d", "e", "f", "a", "b", "c", "d", "e", "f","a", "b", "c", "d", "e", "f","a", "b", "c", "d", "e", "f", "a", "b", "c", "d", "e", "f","a", "b", "c", "d", "e", "f"),
col3= c(2,13,1,21,0,5,3,0,6,4,50,0,0,0,0,9,5,0,51,3,6,0,0,9,89,4,29,1,4,17,6,16,9,1,0,0)), 
                class = "data.frame", row.names = c(NA,-36L))

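In the new data frame I want a new column based on col2: rows where col2 is any of a, b, or c are merged and named abc.1; rows where col2 is d or e are merged and named de.5; and rows where col2 is f are named f.10. new.col3 is the sum of the corresponding values in the old col3, computed separately for each value of col1.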

The result would be:

df2<- structure(list(col1=c("a1","a1","a1","a2","a2","a2"),
new.col2= c("abc.1", "de.5", "f.10", "abc.1", "de.5", "f.10"),
new.col3=c(25,89,5,213,6,26)),
                class = "data.frame", row.names = c(NA,-6L))
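As a quick check of how these totals arise, the sketch below recomputes two of the expected values directly from col3. The name `dat` is a hypothetical stand-in for `df`; it is rebuilt here with `rep()` (an assumption made for brevity) so the snippet runs on its own, and the ID column is omitted because it is not needed for the sums.

```r
# Compact reconstruction of the relevant columns of df from the question
dat <- data.frame(
  col1 = rep(c("a1", "a2"), each = 18),
  col2 = rep(c("a", "b", "c", "d", "e", "f"), times = 6),
  col3 = c(2, 13, 1, 21, 0, 5, 3, 0, 6, 4, 50, 0, 0, 0, 0, 9, 5, 0,
           51, 3, 6, 0, 0, 9, 89, 4, 29, 1, 4, 17, 6, 16, 9, 1, 0, 0)
)

# Sum of col3 for col1 == "a1" where col2 is a, b, or c (the abc.1 row)
abc_a1 <- sum(dat$col3[dat$col1 == "a1" & dat$col2 %in% c("a", "b", "c")])
abc_a1  # 25

# Sum of col3 for col1 == "a1" where col2 is d or e (the de.5 row)
de_a1 <- sum(dat$col3[dat$col1 == "a1" & dat$col2 %in% c("d", "e")])
de_a1   # 89
```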

Use case_when to create the groups, then use summarise to collapse the rows within each group and compute the sum of col3 per group.

library(dplyr)
df %>% 
  group_by(col1, gp = case_when(col2 %in% c("a", "b", "c") ~ 1,
                        col2 %in% c("d", "e") ~ 5,
                        col2 == "f" ~ 10)) %>% 
  summarise(new.col2 = paste(paste0(unique(col2), collapse = ""), unique(gp), sep = "."),
            new.col3 = sum(col3))

Output

# A tibble: 6 × 4
# Groups:   col1 [2]
  col1     gp new.col2 new.col3
  <chr> <dbl> <chr>       <dbl>
1 a1        1 abc.1          25
2 a1        5 de.5           89
3 a1       10 f.10            5
4 a2        1 abc.1         213
5 a2        5 de.5            6
6 a2       10 f.10           26

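Using data.table (df must be converted to a data.table first, because := does not work on a plain data.frame):

library(data.table)
setDT(df)  # convert df to a data.table by reference
df[,new.col2:=fcase(col2 %chin% c('a','b','c'),'abc.1',
            col2 %chin% c('d','e'),'de.5',
            col2 == 'f','f.10')][,.(new.col3=sum(col3)),by=.(col1,new.col2)]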

Output

   col1 new.col2 new.col3
1:   a1    abc.1       25
2:   a1     de.5       89
3:   a1     f.10        5
4:   a2    abc.1      213
5:   a2     de.5        6
6:   a2     f.10       26
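For comparison, the same recoding and aggregation can be sketched in base R with no extra packages. This is not one of the original answers; it is a minimal sketch that assumes a named lookup vector (here called `lookup`, a hypothetical name) maps each col2 value to its new label. The data frame `dat` is a compact reconstruction of `df` built with `rep()` so the snippet is self-contained.

```r
# Compact reconstruction of df from the question (same 36 rows)
dat <- data.frame(
  ID   = rep(rep(c("x1", "x2", "x3"), each = 6), times = 2),
  col1 = rep(c("a1", "a2"), each = 18),
  col2 = rep(c("a", "b", "c", "d", "e", "f"), times = 6),
  col3 = c(2, 13, 1, 21, 0, 5, 3, 0, 6, 4, 50, 0, 0, 0, 0, 9, 5, 0,
           51, 3, 6, 0, 0, 9, 89, 4, 29, 1, 4, 17, 6, 16, 9, 1, 0, 0)
)

# Named lookup vector: old col2 value -> new group label
lookup <- c(a = "abc.1", b = "abc.1", c = "abc.1",
            d = "de.5",  e = "de.5",  f = "f.10")
dat$new.col2 <- unname(lookup[dat$col2])

# Sum col3 within each (col1, new.col2) combination
res <- aggregate(col3 ~ col1 + new.col2, data = dat, FUN = sum)

# aggregate() varies the first grouping variable fastest, so reorder the rows
res <- res[order(res$col1, res$new.col2), ]
names(res)[names(res) == "col3"] <- "new.col3"
rownames(res) <- NULL
res
```

The lookup vector keeps the mapping in one place, so adding or renaming a group means editing only `lookup` rather than a chain of conditions.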
