
Group by multiple non-numeric columns in dplyr

I am trying to group_by all non-numeric columns in dplyr, without naming each one. The aim is then to create two columns holding the percentage change and the absolute change of EUR relative to its value at the earliest TIME in each group.

Right now, I can achieve this by listing all non-numeric column names by hand and then calling mutate:

df <- df %>%
    group_by(CPTY_CLASS, SCENARIO, INSTITUTE, METHOD_DEC, Deaf) %>%
    mutate(PERC_CH = EUR / EUR[which.min(TIME)] - 1,
           ABS_CH = EUR - EUR[which.min(TIME)])

However, this does not scale: with many more non-numeric columns (say 900), listing every name by hand or selecting them by column number is not practical.

Here is the reproducible dataset:

df=structure(list(SCENARIO = c("AC", "AC", "AC", "AC", "AC", "AC", 
    "AC", "AC", "AC", "AC", "AC", "AC"), INSTITUTE = c("BCR", 
    "BCR", "BCR", "BCR", "BCR", "BCR", "BCR", "BCR", "BCR", "BCR", 
    "BCR", "BCR"), METHOD_DEC = c("BIL", "BIL", "BIL", 
    "BIL", "BIL", "BIL", "CRL", "CRL", "CRL", "CRL", "CRL", 
    "CRL"), CPTY_CLASS = c("SME", "SME", "SME", "SME", "SME", 
    "SME", "BANK", "BANK", "BANK", "BANK", "BANK", "BANK"), Deaf = c("Y", 
    "Y", "Y", "N", "N", "N", "Y", "Y", "Y", "N", "N", "N"), TIME = c(2021L, 
    2022L, 2023L, 2021L, 2022L, 2023L, 2021L, 2022L, 2023L, 2021L, 
    2022L, 2023L), EUR = c(13446.7, 16460.6727685, 19510.2132, 
    79951.1120192, 80847.03547, 84940.1414854, 0, 1047.150372256, 
    302.308772901, 104609.421568, 107719.773397, 103466.689156
    )), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"
    ))
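
One way is to collect the names of the non-numeric columns programmatically and splice them into group_by:

library(dplyr)

# names of all columns that are not numeric (character, factor, ...)
group_v <- names(df)[!vapply(df, is.numeric, logical(1))]

df <- df %>%
    group_by(!!!syms(group_v)) %>%   # splice the names in as grouping symbols
    mutate(PERC_CH = EUR / EUR[which.min(TIME)] - 1,
           ABS_CH = EUR - EUR[which.min(TIME)])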
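
The same grouping can probably also be written with tidyselect helpers, which avoids building the vector of names first. The sketch below assumes dplyr 1.1.0 or later (for pick()); on older versions, across(!where(is.numeric)) in the same position should behave the same way. The toy tibble and its column names GRP and FLAG are made up for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical toy data: two non-numeric columns, two numeric ones
toy <- tibble(
  GRP  = c("a", "a", "b", "b"),
  FLAG = c("Y", "Y", "N", "N"),
  TIME = c(2021L, 2022L, 2021L, 2022L),
  EUR  = c(100, 110, 200, 150)
)

res <- toy %>%
  # group by every column that is not numeric, without naming any of them
  group_by(pick(!where(is.numeric))) %>%
  mutate(
    # change relative to the EUR value at the earliest TIME in each group
    PERC_CH = EUR / EUR[which.min(TIME)] - 1,
    ABS_CH  = EUR - EUR[which.min(TIME)]
  ) %>%
  ungroup()
```

Because the selection is evaluated against the data frame, newly added character or factor columns are picked up as grouping variables automatically. Calling ungroup() at the end is optional and only keeps later steps from inheriting the grouping.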
