
Assign new column based on a value chosen from an id in another dataframe

I have a data frame with 100k+ observations and 12 columns. One of those columns is an id that I need to use to create a new column from values held in a second data frame. That second data frame has only 50 observations; it holds the id and the value I need to copy into the first data frame.

I have not been able to code this. Here are partial versions of both data frames, showing only the columns relevant to this question.

DF1 (100k+ obs)

id
010100
010100
010100
010100
010100
010100
010200
010200
010200
010201
010201
010201
010201
010201
010201
010201
010300
010300
010300
010300
010300
010400
010400
010400
010500
010500
010501
010501
010501
010600
010600
010600
010600

Here is the second df with the values and id

id         val
010100  1
010200  2
010201  2
010300  3
010400  4
010500  5
010501  6
010600  7

What I need is val in a new column of DF1, matched on the id shared by both data frames, as follows:

id  New
010100  1
010100  1
010100  1
010100  1
010100  1
010100  1
010200  2
010200  2
010200  2
010201  2
010201  2
010201  2
010201  2
010201  2
010201  2
010201  2
010300  3
010300  3
010300  3
010300  3
010300  3
010400  4
010400  4
010400  4
010500  5
010500  5
010501  6
010501  6
010501  6
010600  7
010600  7
010600  7
010600  7

Any idea is appreciated. Thanks for your time.

Regards

merge is what you want. Alternatively, you may see some speed benefits from using the data.table package:

df1 <- data.frame(id = 1:3)
df2 <- data.frame(id = rep(1:3, each = 2), val = rnorm(6))

> merge(df1, df2)
  id        val
1  1  0.9462113
2  1 -1.7835754
3  2 -1.1604525
4  2  0.2498844
5  3 -1.5187111
6  3  0.5921281

library(data.table)
dt1 <- data.table(df1, key = "id")
dt2 <- data.table(df2, key = "id")

> dt1[dt2]
     id        val
[1,]  1  0.9462113
[2,]  1 -1.7835754
[3,]  2 -1.1604525
[4,]  2  0.2498844
[5,]  3 -1.5187111
[6,]  3  0.5921281

See the help page for ?merge for details on the types of joins available, matching columns, etc. The data.table FAQ is probably the best place to learn the nuances of that package: http://datatable.r-forge.r-project.org/datatable-faq.pdf
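Applied to data shaped like the question's, a minimal base R sketch might look like the following. The two small data frames are stand-ins for the real ones, and storing the ids as character (so the leading zeros survive) is an assumption about how the real data is read in. match() is shown next to merge() because it leaves the row order of df1 untouched.

```r
# Small stand-ins for the question's data; ids are kept as character so the
# leading zeros survive (an assumption about how the real data is stored).
df1 <- data.frame(id = c("010100", "010100", "010200", "010201", "010300"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(id  = c("010100", "010200", "010201", "010300"),
                  val = c(1, 2, 2, 3),
                  stringsAsFactors = FALSE)

# Option 1: find each df1 id's position in df2 and pull the matching val.
# The row order of df1 is unchanged; ids missing from df2 would get NA.
df1$New <- df2$val[match(df1$id, df2$id)]

# Option 2: left join with merge(); by default the result is sorted by id.
merged <- merge(df1["id"], df2, by = "id", all.x = TRUE)
names(merged)[names(merged) == "val"] <- "New"
```

With only 50 distinct ids in the lookup table, the match() form is usually enough, and it avoids having to restore the original row order afterwards.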

You might try something like this:

df3 <- merge(df1, df2, by="id", all = TRUE)

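With the default all = FALSE, df3 contains only the rows whose id appears in both data frames. Setting all = TRUE also keeps unmatched rows from either side, filling the missing columns with NA. If you only need to keep every row of df1, all.x = TRUE is enough.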
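To see how the all arguments change the result, here is a small sketch with made-up ids (the values "a", "b", "c" and "x" are purely illustrative and not from the question):

```r
# df1 has an id ("x") that is missing from df2; df2 has one ("c") missing from df1.
df1 <- data.frame(id = c("a", "a", "b", "x"), stringsAsFactors = FALSE)
df2 <- data.frame(id = c("a", "b", "c"), val = c(1, 2, 3),
                  stringsAsFactors = FALSE)

inner <- merge(df1, df2, by = "id")                # only ids present in both
left  <- merge(df1, df2, by = "id", all.x = TRUE)  # every df1 row; "x" gets NA
full  <- merge(df1, df2, by = "id", all = TRUE)    # additionally keeps "c" from df2

nrow(inner)  # 3
nrow(left)   # 4
nrow(full)   # 5
```

For the question as asked, where every row of the large data frame should survive, the all.x = TRUE form is probably the closest fit.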
