I have a dataframe containing panel data with patent and economic information for the 2012-2020 period. It has a time-invariant variable, investment_year, which is the year in which a company received its initial investment. patent_applications is the annual number of patents filed by a company. Company A, for example, filed 5 patents in 2018, 2 in 2019, etc.
company_name  investment_year  year  patent_applications
A             2018             2020  7
A             2018             2019  2
A             2018             2018  5
.             .                .     .
.             .                .     .
.             .                .     .
A             2018             2012  4
B             2015             2020  10
B             2015             2019  3
B             2015             2018  7
.             .                .     .
.             .                .     .
.             .                .     .
I would like to create a variable which contains the number of applications at t+2, where t is the investment year. So, for example, for Company A the number of applications at t+2 (patent_applications_t2) would be 7, as its investment year (2018) + 2 equals 2020.
I tried the line of code below, but it does not produce the correct result.
df$patent_applications_t2 <- df$patent_applications[df$Year == df$Investment_Year + 2]
There is probably a better way to accomplish what you are looking for, but here is what I came up with.
library(tidyverse)
tbl <- tribble(
  ~company_name, ~investment_year, ~year, ~patent_applications,
  "A", 2018, 2020, 7,
  "A", 2018, 2019, 2,
  "A", 2018, 2018, 5,
  "A", 2018, 2012, 4,
  "B", 2015, 2020, 10,
  "B", 2015, 2019, 3,
  "B", 2015, 2018, 7
)

tbl %>%
  group_by(company_name) %>%
  arrange(investment_year, year) %>%
  # t2 flags rows in the investment year (t) and the year after (t + 1)
  mutate(t2 = ifelse(year - investment_year <= 1 & year - investment_year >= 0, 1, 0)) %>%
  # running total of applications inside that window, 0 outside it
  mutate(cumulative_application = t2 * cumsum(patent_applications * t2)) %>%
  ungroup() %>%
  arrange(company_name) %>%
  select(company_name, investment_year, year, patent_applications, cumulative_application)
You get this result:
# A tibble: 7 x 5
  company_name investment_year  year patent_applications cumulative_application
  <chr>                  <dbl> <dbl>               <dbl>                  <dbl>
1 A                       2018  2012                   4                      0
2 A                       2018  2018                   5                      5
3 A                       2018  2019                   2                      7
4 A                       2018  2020                   7                      0
5 B                       2015  2018                   7                      0
6 B                       2015  2019                   3                      0
7 B                       2015  2020                  10                      0
I chose to show the cumulative applications over the investment year and the year after it, but you can easily show only the second entry.
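If what you need is literally the count filed in year t+2, as described in the question, a more direct dplyr version might look like the sketch below. It assumes one row per company and year, and that investment_year is constant within a company. The names df and result are only illustrative, and the data are just the rows shown in the question, so company B has no 2017 row here and gets NA.

```r
library(dplyr)

# Rows copied from the question; the elided years are left out,
# so B (investment year 2015) has no row for 2017 in this sample.
df <- data.frame(
  company_name        = c("A", "A", "A", "A", "B", "B", "B"),
  investment_year     = c(2018, 2018, 2018, 2018, 2015, 2015, 2015),
  year                = c(2020, 2019, 2018, 2012, 2020, 2019, 2018),
  patent_applications = c(7, 2, 5, 4, 10, 3, 7)
)

result <- df %>%
  group_by(company_name) %>%
  mutate(
    # match() finds the row whose year equals t + 2 within the company;
    # it returns NA when that year is missing, which yields an NA count.
    patent_applications_t2 =
      patent_applications[match(first(investment_year) + 2, year)]
  ) %>%
  ungroup()

result
```

With these rows, every row of company A gets patent_applications_t2 = 7 and every row of company B gets NA. On the full panel, B would pick up its 2017 value instead.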
Another solution (probably better) would be to create a function using within(). Hope this helps you a bit.
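As a rough idea of what a within()-based version could look like in base R, here is a minimal sketch. It is an assumption about how one might do it rather than a definitive implementation: it builds a "company year" key for each row and looks up the row whose key is "company t+2". The data frame name df is illustrative, and the data are again only the rows shown in the question.

```r
# Rows copied from the question; the elided years are left out.
df <- data.frame(
  company_name        = c("A", "A", "A", "A", "B", "B", "B"),
  investment_year     = c(2018, 2018, 2018, 2018, 2015, 2015, 2015),
  year                = c(2020, 2019, 2018, 2012, 2020, 2019, 2018),
  patent_applications = c(7, 2, 5, 4, 10, 3, 7)
)

df <- within(df, {
  # For each row, find the position of the same company's row at t + 2.
  # match() gives NA when no such row exists, so the result is NA there.
  patent_applications_t2 <- patent_applications[
    match(paste(company_name, investment_year + 2),
          paste(company_name, year))
  ]
})

df
```

This avoids any package dependency, at the cost of relying on each company and year pair appearing only once.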