I have the following data frames as an example:
df1 <- read.table(text = "
ID1 speed ID2 Time ID3 Income
4 60 5 100 3 300
3 80 2 90 7 400
2 90 6 100 6 600
", header = TRUE)
df2 <- read.table(text = "
ID Colour CA NA DC NO
2 Y Y N12 A B-12
3 B N M18 B B-17
6 R Y M20 E B-22
4 P N M22 F B-27
7 B Y M11 G B-32
", header = TRUE)
The expected outcome is:
ID1 speed1 Colour1 CA1 NA1 DC1 NO1  ID2 Time Colour2 CA2 NA2 DC2 NO2 ID3 Income Colour3 CA3 NA3 DC3 NO3
4   60     P       N   M22 F   B-27 5   100  NA      xx  xx  xx  xx  3   300    xx      xx  xx  xx  xx
3   80     B       N   M18 B   B-17 2   90   Y       xx  xx  xx  xx  7   400    xx      xx  xx  xx  xx
2   90     Y       Y   N12 A   B-12 6   100  Y       xx  xx  xx  xx  6   600    xx      xx  xx  xx  xx
From the input and the expected output, it seems that each 'ID' column of 'df1' ('ID1', 'ID2', 'ID3') needs to be joined separately with the 'ID' column of 'df2'. Get the 'ID' column names of 'df1' ('nm1') and the names of the 'df2' columns that are not found in 'df1' ('nm2'). Then loop over the sequence of ID columns, do a join for each one, and assign (:=) the values of the 'nm2' columns, suffixed with the loop index, by joining 'ID' from 'df2' with the corresponding 'ID1', 'ID2', 'ID3' from 'df1'.
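library(data.table)

# Work on a copy so that df1 itself is not modified by reference
df3 <- copy(df1)
setDT(df3)

# ID columns of df1 (ID1, ID2, ID3) and the df2 columns to bring over
nm1 <- grep("^ID\\d+$", names(df1), value = TRUE)
nm2 <- setdiff(names(df2), c(names(df1), "ID"))

for (i in seq_along(nm1)) {
  # Update join: match the i-th ID column of df3 against 'ID' in df2
  # and add the matched nm2 columns with the suffix i
  df3[df2, paste0(nm2, i) := mget(nm2), on = setNames("ID", nm1[i])]
}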
-output
df3
   ID1 speed ID2 Time ID3 Income Colour1 CA1 NA.1 DC1  NO1 Colour2  CA2 NA.2  DC2  NO2 Colour3 CA3 NA.3 DC3  NO3
1:   4    60   5  100   3    300       P   N  M22   F B-27    <NA> <NA> <NA> <NA> <NA>       B   N  M18   B B-17
2:   3    80   2   90   7    400       B   N  M18   B B-17       Y    Y  N12    A B-12       B   Y  M11   G B-32
3:   2    90   6  100   6    600       Y   Y  N12   A B-12       R    Y  M20    E B-22       R   Y  M20   E B-22
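To make clearer what each pass of the loop does, here is a minimal sketch of the same lookup written with base R's match(). The data frames 'left' and 'lookup' and the vector 'id_cols' are hypothetical toy objects, smaller than 'df1' and 'df2', and this is only an illustration of the idea, not the data.table join itself.

```r
# Toy data (hypothetical), a cut-down version of df1 and df2
left <- data.frame(ID1 = c(4, 3, 2), ID2 = c(5, 2, 6))
lookup <- data.frame(ID = c(2, 3, 4, 6),
                     Colour = c("Y", "B", "P", "R"),
                     stringsAsFactors = FALSE)

id_cols <- c("ID1", "ID2")
for (i in seq_along(id_cols)) {
  # match() gives the row of 'lookup' for each ID, or NA when the ID is absent
  pos <- match(left[[id_cols[i]]], lookup$ID)
  # Add the looked-up values as a new column with the suffix i
  left[[paste0("Colour", i)]] <- lookup$Colour[pos]
}

left
```

An ID with no match in 'lookup' (5 in this toy data) ends up as NA, which corresponds to the <NA> entries in the output above.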
Another option is to reshape to 'long' format with pivot_longer, do a join with left_join, and then reshape back to 'wide' format with pivot_wider.
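library(dplyr)
library(tidyr)
library(readr)  # for parse_number()

df1 %>%
  # Row id so the rows can be put back together after reshaping
  mutate(rn = row_number()) %>%
  # Stack ID1, ID2, ID3 into a single 'ID' column; 'name' keeps the source column
  pivot_longer(cols = starts_with('ID'), values_to = 'ID') %>%
  left_join(df2, by = 'ID') %>%
  # Keep only the numeric suffix (1, 2, 3) to use in the new column names
  mutate(name = parse_number(name)) %>%
  pivot_wider(names_from = name, values_from = ID:NO, names_sep = "") %>%
  select(-rn)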
-output
# A tibble: 3 x 21
  speed  Time Income   ID1   ID2   ID3 Colour1 Colour2 Colour3 CA1   CA2   CA3   NA.1  NA.2  NA.3  DC1   DC2   DC3   NO1   NO2   NO3
  <int> <int>  <int> <int> <int> <int> <chr>   <chr>   <chr>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1    60   100    300     4     5     3 P       <NA>    B       N     <NA>  N     M22   <NA>  M18   F     <NA>  B     B-27  <NA>  B-17
2    80    90    400     3     2     7 B       Y       B       N     Y     Y     M18   N12   M11   B     A     G     B-17  B-12  B-32
3    90   100    600     2     6     6 Y       R       R       Y     Y     Y     N12   M20   M20   A     E     E     B-12  B-22  B-22
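It may help to look at the intermediate 'long' table before the join. The sketch below stops the pipeline after pivot_longer. The data frame 'toy' is a hypothetical, cut-down version of 'df1' used only for illustration.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical), a cut-down version of df1
toy <- data.frame(ID1 = c(4, 3), speed = c(60, 80), ID2 = c(5, 2))

long <- toy %>%
  # Row id, needed later to rebuild one row per original row
  mutate(rn = row_number()) %>%
  # One row per (original row, ID column); 'name' holds "ID1" or "ID2"
  pivot_longer(cols = starts_with("ID"), values_to = "ID")

long
```

Each original row becomes one row per ID column, so a single join on 'ID' covers all the ID columns at once, and the 'name' column carries the suffix needed to spread the result back to 'wide' format.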