
In an R table of multi-year events, how can you mark the first occurrence of a different event?

I'm working with an R table that stores events that take place over multiple years. Each of these events has several related variables. For one of these variables, I want to pull out its first occurrence in each event series.

events <- c('event1', 'event1', 'event1', 'event1', 'event1', 'event2', 'event2', 'event2')  
years <- c('2000', '2001', '2002', '2003', '2004', '1994', '1995', '1996') 
variable1 <- c('False', 'False', 'False', 'True', 'True', 'False', 'False', 'True')  
df <- data.frame(events, years, variable1)  

I want to figure out a way to generate a new column, First_occurrence, that looks like this:

Event   Year   Variable1   First_occurrence
event1  1994   False       False                              
event1  1995   True        True
event1  1996   True        False
event1  1997   True        False
event2  2000   False       False
event2  2001   False       False
event2  2002   True        True

How would I go about creating that "First_occurrence" column?

Thanks!

Something like this, if you're OK with tidyverse solutions:

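library(dplyr)

df %>%
  arrange(events, years) %>%   # sort so the earliest year comes first within each event
  group_by(events) %>%
  # flag only the row whose position matches the first 'True' in its group
  mutate(first_occ = row_number() == which(variable1 == 'True')[1])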

# A tibble: 8 x 4
# Groups:   events [2]
  events years variable1 first_occ
   <chr> <chr>     <chr>     <lgl>
1 event1  2000     False     FALSE
2 event1  2001     False     FALSE
3 event1  2002     False     FALSE
4 event1  2003      True      TRUE
5 event1  2004      True     FALSE
6 event2  1994     False     FALSE
7 event2  1995     False     FALSE
8 event2  1996      True      TRUE

A small note: code like this gets a little simpler if your "True"/"False" values are stored as actual logicals (booleans) rather than as character strings or factors. Logical comparisons are then easier, and in many cases you can rely on the automatic coercion of TRUE/FALSE to 1/0.
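As a rough illustration of that note, here is a minimal base-R sketch (not the dplyr approach from the answer). It rebuilds the sample data from the question, converts `variable1` to a logical column, and relies on the coercion of TRUE/FALSE to 1/0 through a running count. The names `flag` and `running` are made up for this example.

```r
# Sample data from the question
events <- c('event1', 'event1', 'event1', 'event1', 'event1', 'event2', 'event2', 'event2')
years <- c('2000', '2001', '2002', '2003', '2004', '1994', '1995', '1996')
variable1 <- c('False', 'False', 'False', 'True', 'True', 'False', 'False', 'True')
df <- data.frame(events, years, variable1)

# Convert the "True"/"False" strings into a real logical column
# ('flag' is a hypothetical column name)
df$flag <- df$variable1 == 'True'

# Sort so the earliest year comes first within each event
df <- df[order(df$events, df$years), ]

# Running count of TRUE values within each event (TRUE counts as 1, FALSE as 0)
running <- ave(as.integer(df$flag), df$events, FUN = cumsum)

# A row is the first occurrence if it is TRUE and the running count has just reached 1
df$first_occ <- df$flag & running == 1
```

With the logical column in place, the first occurrence is simply the row where the flag is TRUE and the running count equals 1, so no string comparison is needed after the initial conversion.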

Note that this answer may yield undesirable results if an event has no occurrences at all: `which(variable1 == 'True')[1]` is then NA, so every row of that event gets NA rather than FALSE.
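To make that caveat concrete, here is a minimal base-R sketch of one possible guard. It is an alternative under stated assumptions, not the answer's own code: `event3` is a hypothetical event added purely for illustration (it is not in the original data), and the names `df2`, `na_case`, `pos` and `first_pos` are made up for this example.

```r
# Sample data from the question plus a hypothetical event3 with no 'True' rows
df2 <- data.frame(
  events    = c(rep('event1', 5), rep('event2', 3), rep('event3', 2)),
  years     = c('2000', '2001', '2002', '2003', '2004',
                '1994', '1995', '1996', '2010', '2011'),
  variable1 = c('False', 'False', 'False', 'True', 'True',
                'False', 'False', 'True', 'False', 'False')
)
df2 <- df2[order(df2$events, df2$years), ]

# The problem: with no 'True' in the group, which() is empty and [1] gives NA,
# so comparing row positions against it yields NA instead of FALSE
na_case <- which(df2$variable1[df2$events == 'event3'] == 'True')[1]

is_true <- df2$variable1 == 'True'

# Position of each row within its event
pos <- ave(seq_along(is_true), df2$events, FUN = seq_along)

# Position of the first TRUE within each event, or 0 if there is none
first_pos <- ave(as.integer(is_true), df2$events,
                 FUN = function(x) match(1L, x, nomatch = 0L))

# pos starts at 1, so it never equals 0 and event3 stays FALSE on every row
df2$first_occ <- pos == first_pos
```

Using `match()` with `nomatch = 0L` means an event without any occurrence compares against position 0, which no row has, so the column stays free of NA values.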


 