R dplyr Replace unique values from one data frame with unique values from other data frame with unequal row numbers

I'd like to replace the unique IDs in one df with the unique IDs from another df. Let's say df_long contains time series data per trial and df_short contains only the values averaged over time.

  1. How can I mutate the values in ID if I have to use ID for my grouping in group_by?
  2. How do I apply a list of unique values from unique(df_long$ID) to unique(df_short$ID) if the data frames have unequal row numbers?

How would you do this using dplyr?

#let's assume this df contains averaged trials 
df_short <- data.frame(ID = rep(1:4,each=9), 
                 Trial= rep(1:3,12),  
                 Session = rep(rep(1:3,each=3),4) ) 

df_long <- data.frame(ID = rep(c(11,13,18,19),each=3*3*3), 
                 Trial= rep(rep(c(1,2,3),each=3),4*3),  
                 Time = rep(1:3,3*4*3),
                 Session = rep(rep(1:3,each=9),4))    

    df_short[1:15,]
       ID Trial Session
    1   1     1       1
    2   1     2       1
    3   1     3       1
    4   1     1       2
    5   1     2       2
    6   1     3       2
    7   1     1       3
    8   1     2       3
    9   1     3       3
    10  2     1       1
    11  2     2       1
    12  2     3       1
    13  2     1       2
    14  2     2       2
    15  2     3       2

df_long[1:15,]
   ID Trial Time Session
1  11     1    1       1
2  11     1    2       1
3  11     1    3       1
4  11     2    1       1
5  11     2    2       1
6  11     2    3       1
7  11     3    1       1
8  11     3    2       1
9  11     3    3       1
10 11     1    1       2
11 11     1    2       2
12 11     1    3       2
13 11     2    1       2
14 11     2    2       2
15 11     2    3       2

Desired result (first 15 rows of df_short with the IDs replaced):

 ID Trial Session
1  11     1       1
2  11     2       1
3  11     3       1
4  11     1       2
5  11     2       2
6  11     3       2
7  11     1       3
8  11     2       3
9  11     3       3
10 13     1       1
11 13     2       1
12 13     3       1
13 13     1       2
14 13     2       2
15 13     3       2

If the replacement is to be done "respectively" (in order of occurrence), here are two options. In both cases I will create a new column so that the result can be verified--I'll leave dropping the old column and renaming the new column up to you.

# Option 1: with factor conversion
library(dplyr)

df_short %>%
  mutate(
    # map each old ID, in order of first appearance, to the corresponding new ID
    new_ID = factor(ID, levels = unique(ID), labels = unique(df_long$ID)),
    new_ID = as.numeric(as.character(new_ID)) # convert back to numeric
  )

# Option 2: make a look-up table and join
id_lookup <- tibble(ID = unique(df_short$ID), new_ID = unique(df_long$ID))
df_short %>% left_join(id_lookup, by = "ID")
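
To finish Option 2, the old column can be overwritten and the helper column dropped. Below is a minimal sketch, assuming both data frames have the same number of unique IDs and that matching by order of first appearance is what you want. The vector new_ids is a stand-in for unique(df_long$ID), and the name result is only illustrative.

```r
library(dplyr)

# Same df_short as in the question
df_short <- data.frame(ID = rep(1:4, each = 9),
                       Trial = rep(1:3, 12),
                       Session = rep(rep(1:3, each = 3), 4))

# Assumption: stands in for unique(df_long$ID)
new_ids <- c(11, 13, 18, 19)

# Positional matching only makes sense if the counts agree
stopifnot(length(unique(df_short$ID)) == length(new_ids))

id_lookup <- tibble(ID = unique(df_short$ID), new_ID = new_ids)

result <- df_short %>%
  left_join(id_lookup, by = "ID") %>%
  mutate(ID = new_ID) %>%   # overwrite the old IDs in place
  select(-new_ID)           # drop the helper column

head(result)
```

Overwriting ID with mutate() keeps the original column order, which avoids a separate rename step.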

If your data is grouped by the ID column, you should ungroup() before doing these operations, apply them, and then group_by() again if necessary. (The join will work fine with grouped data, but you won't be able to drop the old ID column while the data is grouped by it.)
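
The ungroup, replace, regroup sequence could look roughly like this. It is a sketch under the same assumptions as the options above (equal numbers of unique IDs, matched in order of appearance). new_ids again stands in for unique(df_long$ID), and the names grouped and regrouped are illustrative.

```r
library(dplyr)

df_short <- data.frame(ID = rep(1:4, each = 9),
                       Trial = rep(1:3, 12),
                       Session = rep(rep(1:3, each = 3), 4))

# Assumption: stands in for unique(df_long$ID)
new_ids <- c(11, 13, 18, 19)
id_lookup <- tibble(ID = unique(df_short$ID), new_ID = new_ids)

# Data that is already grouped by the column we want to change
grouped <- df_short %>% group_by(ID)

regrouped <- grouped %>%
  ungroup() %>%                      # release ID so it can be modified
  left_join(id_lookup, by = "ID") %>%
  mutate(ID = new_ID) %>%
  select(-new_ID) %>%
  group_by(ID)                       # restore the grouping on the new IDs

group_vars(regrouped)
n_groups(regrouped)
```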
