Compare columns within groups of a data frame for equality

My goal here is to compare a string or numeric column within groups defined by ID. For example, if both values of var1 in a group were "NORMAL", a new column would say TRUE, otherwise FALSE. I know I can use summarise_all(), but I need the result as a new column for another project. I would also like the comparison to work for numeric columns. All values in the chosen column have to be exactly the same within a group. Some of the groups have more than 2 members.

df <- structure(list(ID = c("A1.1234567", "A1.12345"), 
                 var1 = c("NORMAL", "NORMAL"), 
                 var2 = c("NORMAL", "NORMAL"), 
                 var3 = c("NORMAL", "NORMAL"), 
                 var4 = c("NORMAL", "NORMAL"), 
                 var5 = c("NORMAL", "NORMAL"), 
                 var6 = c("NORMAL", "NORMAL"), 
                 var7 = c("25", "25"), 
                 var8 = c(6, 9)),

            .Names = c("ID", "var1", "var2", "var3", "var4", "var5", "var6", "var7", "var8"), 
            class = "data.frame", row.names = c(NA, -2L))

I want it to look like

        ID   var1   var2   var3   var4   var5   var6 var7 var8 var7.true var8.true
A1.1234567 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL   25    6      TRUE     FALSE
A1.1234567 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL   25    9      TRUE     FALSE

My only idea was to mutate it, but I can't seem to compare them correctly.

You can use mutate_at (as opposed to mutate_all) to leave out ID, since we are not grouping by it, and give the new variables a name so that they do not overwrite the existing ones, i.e.

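library(dplyr)
# Adds <column>_new for every column except ID (funs() is deprecated since dplyr 0.8.0)
df %>% 
  mutate_at(vars(-ID), funs(new = ifelse(all(. == 'NORMAL'), TRUE, FALSE)))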

which gives

             ID   var1   var2   var3   var4   var5   var6     var7   var8 var1_new var2_new var3_new var4_new var5_new var6_new var7_new var8_new
1 A1.1234567_10 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL   NORMAL NORMAL     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE
2 A1.1234567_20 NORMAL NORMAL NORMAL NORMAL NORMAL NORMAL ABNORMAL NORMAL     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE

EDIT: As per your comment, there are a few ways to test that all elements are equal. I went with checking that the number of unique values is 1 (which holds when all are the same), i.e.

mutate_at(df, vars(-ID), funs(new = length(unique(.)) == 1))
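To see why this test works for strings and numbers alike, here is a minimal sketch on plain vectors using only base R. The helper name all_same is made up for illustration and is not part of the answer above.

```r
# Hypothetical helper (name chosen for illustration): TRUE when every
# element of x is identical, for character or numeric input alike.
all_same <- function(x) length(unique(x)) == 1

all_same(c("NORMAL", "NORMAL"))    # TRUE
all_same(c("NORMAL", "ABNORMAL"))  # FALSE
all_same(c(25, 25, 25))            # TRUE
all_same(c(6, 9))                  # FALSE
```

Note that unique() keeps NA as a value of its own, so a vector such as c(1, NA) is reported as not all equal.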

BONUS: You no longer need ifelse, since the comparison itself already returns TRUE or FALSE.
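The question asks for the comparison within each ID group, and groups may have more than 2 members. The following is a rough sketch of a grouped variant, not the answer's own code. It assumes dplyr 1.0.0 or later (for across()), and the data frame df2 and the "_same" suffix are made up for illustration.

```r
library(dplyr)

# Made-up example data (an assumption for illustration): five rows, two IDs.
df2 <- data.frame(
  ID   = c("A1", "A1", "B2", "B2", "B2"),
  var7 = c("25", "25", "NORMAL", "NORMAL", "ABNORMAL"),
  var8 = c(6, 9, 1, 1, 1),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  group_by(ID) %>%
  # For every non-grouping column, add <name>_same: TRUE when the group
  # holds exactly one distinct value in that column.
  mutate(across(everything(), ~ n_distinct(.x) == 1, .names = "{.col}_same")) %>%
  ungroup()

print(res)
```

Because across() skips grouping columns inside a grouped mutate(), ID does not need to be excluded by hand, and the single TRUE/FALSE computed per group is recycled to every row of that group.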
