简体   繁体   中英

how to to create new variable based on first entry of another variable

I have created df

  gvkey  year 
1  1004  1965    
2  1004  1966       
3  1004  1967      
4  1005  1950        
5  1005  1951      
...

gvkey = company id

My dataframe contains more than 1,200 unique gvkeys with different starting values for year .

I want to create a column in df called ipo_date which equals the first entry in the year column for a particular gvkey .

  gvkey  year  ipo_date
1  1004  1965    1965
2  1004  1966    1965
3  1004  1967    1965
4  1005  1950    1950
5  1005  1951    1950
...

Afterwards I want to delete all duplicate rows for ipo_date and delete the column year

  gvkey  ipo_date
1  1004   1965
2  1005   1950
...

Thank you so much!!!

You could select the first row of every gvkey

library(dplyr)
df %>%  group_by(gvkey) %>%  slice(1L) %>%  rename_at(2, ~"ipo_date")

#  gvkey ipo_date
#  <int>    <int>
#1  1004     1965
#2  1005     1950

Using the same logic in data.table and base R, we can do

library(data.table)
setDT(df)[, .SD[1L], gvkey]

and

aggregate(year~gvkey, df, head, 1)
library(data.table)
df = as.data.table(df)
df = df[order(gvkey,year)]
df[,`:=`(ipo_date=min(year)),by=c("gvkey")][,`:=`(year=NULL)]
df = unique(df)

This should do the trick:

library(tidyverse)
df %>% 
  group_by(gvkey) %>% 
  mutate(ipo_date = min(year)) %>% 
  select(-year) %>% 
  distinct()

Output:

# A tibble: 2 x 2
# Groups:   gvkey [2]
  gvkey ipo_date
  <dbl>    <dbl>
1  1004     1965
2  1005     1950

Here is a one line solution:

library(plyr)
df <- data.frame(gvkey = c(1004, 1004, 1004, 1005, 1005),
                 year = c(1965, 1966, 1967, 1950, 1951))

df_agg <- ddply(df, c('gvkey'), summarise, ipo_date = min(year))
> df_agg
  gvkey ipo_date
1  1004     1965
2  1005     1950

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM