I have two data frames. df1 looks like this:
V1 V2 V3 V4 V5
1 1 7506 10949 3 0.2284710
2 1 28272 29965 147 0.6033058
3 1 36598 37518 843 0.7459016
4 1 37512 40365 52 0.4121901
5 1 48795 50666 150 0.8050847
6 1 50660 52365 92 0.6995614
7 1 52850 54453 1337 0.8991597
8 1 54447 54527 279 0.9858824
9 1 54816 64015 2 0.2787356
10 1 70664 74349 17 0.5549451
And df2 looks like this:
1 1 1 7512
2 1 7506 10949
3 1 10943 13175
4 1 13169 20070
5 1 20064 28278
6 1 28272 29965
7 1 29959 36604
8 1 36598 37518
9 1 37512 40365
10 1 40359 48801
I would like to combine them into a new data frame, df3, so that where a row of df2 matches a row of df1 it takes the values of df1$V4 and df1$V5, and otherwise those values are NA or 0. The final data frame should look like this:
1 1 7512 0 0
1 7506 10949 3 0.2284710
1 10943 13175 0 0
1 13169 20070 0 0
1 20064 28278 0 0
1 28272 29965 147 0.6033058
1 29959 36604 0 0
1 36598 37518 843 0.7459016
1 37512 40365 52 0.4121901
1 40359 48801 0 0
... and so on until the end of the files.
Could you please help me? Which function does this?
Thank you in advance.
First, to make your example easier to reproduce, it helps to include your data in a form that can be pasted straight into R (for example, the output of dput()), like this:
df1 <- structure(list(V1 = 1:10, V2 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), V3 = c(7506L, 28272L, 36598L, 37512L, 48795L, 50660L,
52850L, 54447L, 54816L, 70664L), V4 = c(10949L, 29965L, 37518L,
40365L, 50666L, 52365L, 54453L, 54527L, 64015L, 74349L), V5 = c(3L,
147L, 843L, 52L, 150L, 92L, 1337L, 279L, 2L, 17L), V6 = c(0.228471,
0.6033058, 0.7459016, 0.4121901, 0.8050847, 0.6995614, 0.8991597,
0.9858824, 0.2787356, 0.5549451)), class = "data.frame", row.names = c(NA,
-10L))
df2 <- structure(list(V1 = 1:10, V2 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), V3 = c(1L, 7506L, 10943L, 13169L, 20064L, 28272L,
29959L, 36598L, 37512L, 40359L), V4 = c(7512L, 10949L, 13175L,
20070L, 28278L, 29965L, 36604L, 37518L, 40365L, 48801L)), class = "data.frame", row.names = c(NA,
-10L))
Then build a key from the two key columns in each dataset and match the positions:
# paste() puts a space between the two keys, so pairs such as (1, 23) and (12, 3) cannot collide
index <- match(paste(df2$V3, df2$V4), paste(df1$V3, df1$V4))
Then use that index to fill in the values in your second dataframe:
df2$V5 <- df1$V5[index]
df2$V6 <- df1$V6[index]
You might have different column names in your data, of course, since I just quickly copy/pasted your data and picked up the row names as a column as well.
df2
V1 V2 V3 V4 V5 V6
1 1 1 1 7512 NA NA
2 2 1 7506 10949 3 0.2284710
3 3 1 10943 13175 NA NA
4 4 1 13169 20070 NA NA
5 5 1 20064 28278 NA NA
6 6 1 28272 29965 147 0.6033058
7 7 1 29959 36604 NA NA
8 8 1 36598 37518 843 0.7459016
9 9 1 37512 40365 52 0.4121901
10 10 1 40359 48801 NA NA
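The same lookup can also be written as a left join with base R's merge(), which avoids building a pasted key by hand. The following is only a sketch on a small stand-in for the data: the column names follow the naming used in this answer (V2, V3, V4 as keys, V5 and V6 as values), and the row-index column V1 is left out on purpose, which is an assumption about what the final result should contain. It also shows one way to get 0 instead of NA, as the question allows.

```r
# Small stand-ins for the question's data (column names are assumptions)
df1 <- data.frame(V2 = 1L,
                  V3 = c(7506L, 28272L),
                  V4 = c(10949L, 29965L),
                  V5 = c(3L, 147L),
                  V6 = c(0.2284710, 0.6033058))
df2 <- data.frame(V2 = 1L,
                  V3 = c(1L, 7506L, 10943L, 28272L),
                  V4 = c(7512L, 10949L, 13175L, 29965L))

# Left join: keep every row of df2 and attach V5 and V6 where all keys match.
# Note that merge() sorts the result by the key columns by default.
df3 <- merge(df2, df1, by = c("V2", "V3", "V4"), all.x = TRUE)

# Unmatched rows come back as NA; replace them with 0 if that is preferred
df3$V5[is.na(df3$V5)] <- 0L
df3$V6[is.na(df3$V6)] <- 0

df3
```

Because merge() reorders rows by the keys, the match() approach above may be the better fit when the original row order of df2 has to be preserved and the keys are not already sorted.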
If I understand correctly, the OP requests to right join df1 with df2 on key columns V1, V2, and V3. The result will consist of all rows of df2 with columns V4 and V5 appended from df1 where the keys match.
One possible implementation is with data.table:
library(data.table)
setDT(df1)[setDT(df2), on = .(V1, V2, V3)]
    V1    V2    V3   V4        V5
 1:  1     1  7512   NA        NA
 2:  1  7506 10949    3 0.2284710
 3:  1 10943 13175   NA        NA
 4:  1 13169 20070   NA        NA
 5:  1 20064 28278   NA        NA
 6:  1 28272 29965  147 0.6033058
 7:  1 29959 36604   NA        NA
 8:  1 36598 37518  843 0.7459016
 9:  1 37512 40365   52 0.4121901
10:  1 40359 48801   NA        NA
Data
library(data.table)
df1 <- fread("rn V1 V2 V3 V4 V5
1 1 7506 10949 3 0.2284710
2 1 28272 29965 147 0.6033058
3 1 36598 37518 843 0.7459016
4 1 37512 40365 52 0.4121901
5 1 48795 50666 150 0.8050847
6 1 50660 52365 92 0.6995614
7 1 52850 54453 1337 0.8991597
8 1 54447 54527 279 0.9858824
9 1 54816 64015 2 0.2787356
10 1 70664 74349 17 0.5549451", drop = 1L)
df2 <- fread("rn V1 V2 V3
1 1 1 7512
2 1 7506 10949
3 1 10943 13175
4 1 13169 20070
5 1 20064 28278
6 1 28272 29965
7 1 29959 36604
8 1 36598 37518
9 1 37512 40365
10 1 40359 48801", drop = 1L)
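The question also allows 0 in place of NA for rows without a match. A minimal sketch of that follow-up step with data.table is shown below; the small tables dt1 and dt2 are hypothetical stand-ins that use the same column layout as the fread() data, and the replacement by reference with := is just one of several ways to do it.

```r
library(data.table)

# Hypothetical stand-ins with the same layout as the data above
dt1 <- data.table(V1 = 1L,
                  V2 = c(7506L, 28272L),
                  V3 = c(10949L, 29965L),
                  V4 = c(3L, 147L),
                  V5 = c(0.2284710, 0.6033058))
dt2 <- data.table(V1 = 1L,
                  V2 = c(1L, 7506L, 10943L, 28272L),
                  V3 = c(7512L, 10949L, 13175L, 29965L))

# Right join: all rows of dt2, in dt2's order, with V4 and V5 taken from dt1
dt3 <- dt1[dt2, on = .(V1, V2, V3)]

# Replace the NAs of the unmatched rows by reference
dt3[is.na(V4), V4 := 0L]
dt3[is.na(V5), V5 := 0]

dt3
```

Keeping NA is usually the safer default, since a 0 cannot be told apart from a genuine zero value in df1; replace only if downstream code needs it.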