How to create a new column using values in an existing column to tell which column the new values will come from?

Here is some example data.

testdata <- data.frame(A = c(1, 0, 1, 1, 0, 0),
                       B = c(2, 0, 0, 0, 0, 1),
                       D0 = c("A", "A", "B", "C", "A", "A"),
                       D1 = c("B", "C", "C", "A", "B", "B"),
                       D2 = c("C", NA, NA, NA, NA, NA),
                       stringsAsFactors = FALSE)

What I want to do is create one new column for each of columns A and B (e.g., columns Aprime and Bprime). The values placed in the new columns will come from the D columns (e.g., D0, D1, and D2), and the value in column A or B tells which D column to pick. For example, the first value of the new column Aprime will be "B" because the first row of A is 1, so it should take the first row of the D1 column. The first row of Bprime should be "C" because the first value of B is 2, so it should take the first D2 value. The result should look like this:

  A B D0 D1   D2 Aprime Bprime
1 1 2  A  B    C      B      C
2 0 0  A  C <NA>      A      A
3 1 0  B  C <NA>      C      B
4 1 0  C  A <NA>      A      C
5 0 0  A  B <NA>      A      A
6 0 1  A  B <NA>      A      B

I used the ifelse statements below to produce the above result:

testdata$Aprime <- ifelse(testdata$A == 0, testdata$D0, ifelse(testdata$A == 1, testdata$D1, testdata$D2))
testdata$Bprime <- ifelse(testdata$B == 0, testdata$D0, ifelse(testdata$B == 1, testdata$D1, testdata$D2))

However, I would like a more generic solution because the D columns are not fixed (e.g., there can be D3 up to D20). How can I do this without writing a nested ifelse for every D column above D0 (i.e., D1 and so on)?

TIA.

Here is a base R method that uses matrix subsetting to select the values and lapply to loop through columns A and B.

testdata[c("aprime", "bprime")] <-
      lapply(testdata[c("A", "B")],
             function(x) testdata[, 3:5][cbind(seq_len(nrow(testdata)), x + 1)])

The left side provides the names for the new variables. On the right, the first argument of lapply supplies the set of index variables to loop over (A and B). The second argument is a function whose body, testdata[, 3:5][cbind(seq_len(nrow(testdata)), x + 1)], first subsets the data.frame to the lookup columns (D0-D2) and then indexes that subset with a two-column matrix built by cbind. The row indices come from seq_len(nrow(testdata)), and the column indices come from the variable that lapply passes in as x, plus 1 because the D columns start at D0 while R indexing starts at 1.

This returns

testdata
  A B D0 D1   D2 aprime bprime
1 1 2  A  B    C      B      C
2 0 0  A  C <NA>      A      A
3 1 0  B  C <NA>      C      B
4 1 0  C  A <NA>      A      C
5 0 0  A  B <NA>      A      A
6 0 1  A  B <NA>      A      B
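
Since the question mentions that the D columns may run up to D20, the hard-coded 3:5 could be replaced by a lookup on the column names. The following is only a sketch of that idea, not part of the original answer: the names d_cols, d_mat, and pick_d are hypothetical, and it assumes every lookup column is named "D" followed by a number and that the index columns only contain values for which a matching D column exists.

```r
testdata <- data.frame(A = c(1, 0, 1, 1, 0, 0),
                       B = c(2, 0, 0, 0, 0, 1),
                       D0 = c("A", "A", "B", "C", "A", "A"),
                       D1 = c("B", "C", "C", "A", "B", "B"),
                       D2 = c("C", NA, NA, NA, NA, NA),
                       stringsAsFactors = FALSE)

# Find the lookup columns by name (assumption: they are called D0, D1, ...).
d_cols <- grep("^D[0-9]+$", names(testdata), value = TRUE)

# Order them by their numeric suffix so that D10 sorts after D9, not after D1.
d_cols <- d_cols[order(as.integer(sub("^D", "", d_cols)))]

# Convert once to a character matrix so it can be indexed by (row, column) pairs.
d_mat <- as.matrix(testdata[d_cols])

# For an index vector idx, pick column idx + 1 in each row (D0 is column 1).
pick_d <- function(idx) d_mat[cbind(seq_len(nrow(d_mat)), idx + 1)]

testdata[c("Aprime", "Bprime")] <- lapply(testdata[c("A", "B")], pick_d)
```

With this version, adding more D columns or more index columns only requires changing the vectors of column names, not the selection logic.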

For more information on matrix subsetting, take a look at ?"[".
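
As a small standalone illustration of the mechanism, indexing a matrix with a two-column matrix treats each row of the index as one (row, column) pair. The object names m, idx, and picked below are made up for this example.

```r
# A 3 x 2 character matrix, filled column by column:
# column 1 is a, b, c and column 2 is d, e, f.
m <- matrix(c("a", "b", "c", "d", "e", "f"), nrow = 3)

# Each row of idx is one (row, column) pair: (1, 2), (2, 1), (3, 2).
idx <- cbind(1:3, c(2, 1, 2))

# One element is returned per pair.
picked <- m[idx]
picked
# [1] "d" "b" "f"
```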
