
Create new variable based on prior observation value from another column

I am constructing a new variable whose value depends on the prior row of another column, so the order of the data is important. This is how my data currently looks:

ID  Cong  Comm    Y
 1   52    3      0
 1   53    3      0
 1   54    3      1
 1   53    4      1
 2   50    2      1
 2   50    7      1
 3   48    4      1
 4   48    3      1
 4   48    7      0
 4   49    7      1

I would like to create a new variable called Y2. If an observation has Y = 0, then Y2 in that observation should equal 1. If the following row also has Y = 0, add 1 to the previous Y2 value (so Y2 for that observation equals 2). Continue this process until Y = 1, add 1, and then stop. Essentially, the new variable counts up until the other column's value equals 1, and then the process repeats.

This is what it should look like:

ID  Cong  Comm    Y   Y2
 1   52    3      0   1
 1   53    3      0   2
 1   54    3      1   3 
 1   53    4      1   1
 2   50    2      1   1
 2   50    7      1   1
 3   48    4      1   1
 4   48    3      1   1
 4   48    7      0   1 
 4   49    7      1   2

Here is my sample data frame:

df <- data.frame(
  ID   = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L),
  Cong = c(52L, 53L, 54L, 53L, 50L, 50L, 48L, 48L, 48L, 49L),
  Comm = c(3L, 3L, 3L, 4L, 2L, 7L, 4L, 3L, 7L, 7L),
  Y    = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L))

Would a loop or an if-else command be the best way to tackle this? I tried an if-else statement, but my code did not work. Any recommendations would be great.

You can do it like this, supposing your data.frame is df:

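# f7() is defined further down; run that definition first
y <- df$Y
# TRUE where a value equals its predecessor (the first row is compared with 0)
same_as_prev <- y == c(0, head(y, -1))
# zero out values that differ from their predecessor, so the 1 closing a run of 0s joins that run
y[which(!same_as_prev)] <- 0

# count along each run of 0s; every remaining 1 gets Y2 = 1
df$Y2 <- ifelse(y == 0, f7(!y), 1)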

#   ID Cong Comm Y Y2
#1   1   52    3 0  1
#2   1   53    3 0  2
#3   1   54    3 1  3
#4   1   53    4 1  1
#5   2   50    2 1  1
#6   2   50    7 1  1
#7   3   48    4 1  1
#8   4   48    3 1  1
#9   4   48    7 0  1
#10  4   49    7 1  2

The trick is done with:

f7 <- function(x) { tmp <- cumsum(x); tmp - cummax((!x) * tmp) }

It is taken entirely from this great post: "count how many consecutive values are true".
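
To see what f7 does, here is a small sketch on a made-up logical vector (x is purely illustrative and not from the question). cumsum(x) counts every TRUE seen so far, and cummax((!x) * tmp) remembers the count reached at the most recent FALSE, so the difference is the length of the current run of TRUE values.

```r
# f7 as defined in the answer: running length of consecutive TRUE values
f7 <- function(x) {
  tmp <- cumsum(x)            # total number of TRUEs seen so far
  tmp - cummax((!x) * tmp)    # subtract the total recorded at the last FALSE
}

# Illustrative input (hypothetical, not from the question)
x <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
run_len <- f7(x)
print(run_len)
# [1] 1 2 0 1 2 3 0
```

Each FALSE resets the counter to 0, which is why the answer passes !y: runs of Y = 0 become runs of TRUE.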

Finally, this solution is fully vectorized; it uses no loop.
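
As a cross-check, the rule from the question can also be written by starting a new counter on the first row and on every row that follows a Y of 1, then numbering the rows inside each block. This is a sketch of an alternative, not the method used above, and the helper name count_until_one is made up here. It reproduces the expected Y2 on the sample data and also separates inputs such as Y = 0, 1, 0, 1, where the predecessor comparison above would keep counting (1, 2, 3, 4) instead of restarting (1, 2, 1, 2).

```r
# Hypothetical helper: count up within blocks that end at each Y == 1
count_until_one <- function(y) {
  if (length(y) == 0) return(integer(0))
  # a block starts on the first row and right after every row where Y == 1
  starts <- c(TRUE, head(y, -1) == 1)
  block <- cumsum(starts)
  # number the rows inside each block: 1, 2, 3, ...
  ave(seq_along(y), block, FUN = seq_along)
}

# Sample data from the question
df <- data.frame(
  ID   = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L),
  Cong = c(52L, 53L, 54L, 53L, 50L, 50L, 48L, 48L, 48L, 49L),
  Comm = c(3L, 3L, 3L, 4L, 2L, 7L, 4L, 3L, 7L, 7L),
  Y    = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L))

df$Y2 <- count_until_one(df$Y)
print(df$Y2)
# [1] 1 2 3 1 1 1 1 1 1 2
```

Like the answer above, this ignores ID and treats the rows as one ordered sequence, which matches the expected output in the question. It is also vectorized, relying only on cumsum and ave from base R.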


