
Create a new column from different columns of one data frame conditioned on another column from another data frame

Suppose I have two data frames:

df1 <- data.frame(A = 1:6, B = 7:12, C = rep(1:2, 3))
df2 <- data.frame(C = 1:2, D = c("A", "B"))

I want to create a new column E in df1 whose value depends on column C, which maps to column D in df2. For example, the C value in the first row of df1 is 1. In df2, C = 1 corresponds to D = "A", so the value of E in the first row of df1 should come from column A, i.e., 1.

As suggested by "Select values from different columns based on a variable containing column names", I can achieve this in two steps:

library(data.table)
setDT(df1)
setDT(df2)
df3 <- df1[df2, on = "C"] # step 1: join the two data.tables on C
df3[, E := .SD[[.BY[[1]]]], by = D] # step 2: within each D group, take E from the column named by D

My question is: can this be done in one step? Furthermore, as my data is relatively large, the first step of this solution takes a lot of time. Is there a faster way? Any suggestions?

You can try this, where column C indicates which column of df1 to take the value from:

setDT(df1)
df1[, e := eval(parse(text = names(df1)[C])), by = 1:nrow(df1)]
df1

   A  B C  e
1: 1  7 1  1
2: 2  8 2  8
3: 3  9 1  3
4: 4 10 2 10
5: 5 11 1  5
6: 6 12 2 12
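
The per-row eval(parse(...)) call above runs one group per row, which may be slow on large data. Here is a minimal base R sketch of a vectorized alternative using matrix indexing. It assumes, as that answer does, that C holds the position of the source column among A and B. The names df and vals are illustrative and not from the original post.

```r
# Same example data as in the question, kept as a plain data.frame
df <- data.frame(A = 1:6, B = 7:12, C = rep(1:2, 3))

# Candidate source columns as a matrix (assumes A and B share a type)
vals <- as.matrix(df[c("A", "B")])

# Index with a two-column matrix of (row, column) pairs:
# row i takes its value from column number C[i]
df$e <- vals[cbind(seq_len(nrow(df)), df$C)]

df
```

This avoids both the join and the per-row grouping, but it only works when C can be used directly as a column position.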

Here's how I would do it:

df1[df2, on=.(C), D := i.D][, E := .SD[[.BY$D]], by=D]

   A  B C D  E
1: 1  7 1 A  1
2: 2  8 2 B  8
3: 3  9 1 A  3
4: 4 10 2 B 10
5: 5 11 1 A  5
6: 6 12 2 B 12

This adds the columns to df1 by reference instead of making a new table, so I guess it is more efficient than building df3. Also, since they're added to df1, the rows retain their original ordering.
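
To illustrate the ordering point, here is a small self-contained sketch that runs both approaches on the example data. It assumes the data.table package is installed and builds the tables with data.table() directly, so that D is a character column.

```r
library(data.table)

df1 <- data.table(A = 1:6, B = 7:12, C = rep(1:2, 3))
df2 <- data.table(C = 1:2, D = c("A", "B"))

# Two-step approach: the join builds a new table whose rows follow df2's order
df3 <- df1[df2, on = "C"]
df3$A   # 1 3 5 2 4 6

# Update join: D and E are added to df1 in place, so row order is unchanged
df1[df2, on = .(C), D := i.D]
df1[, E := .SD[[.BY$D]], by = D]
df1$A   # 1 2 3 4 5 6
df1$E   # 1 8 3 10 5 12
```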
