Using a for loop in R to loop through the names of data frames

I have data on mergers for 20 years for various firms. I have used a "for" loop in R to separate data for each year which gives me 20 data frames in the global environment. Each data frame is identified by its year: Merger2000 to Merger2019 for 20 years. Now I want to write another for loop to find the unique companies in each data frame (that is, unique firms in each year). Each company is identified by a unique company code (co_code). I know how to do this for each year separately. For example, for the year 2000, I would do something like:

uniquemerger2000 <- Merger2000 %>% distinct(co_code, .keep_all = TRUE)

How do I run a for loop to enable this operation for all years (that is from 2000-2019)? There is some indexing required in the code but I am not sure how to operationalise this in a loop.

Any help would be appreciated. Thanks!

It is usually better to keep related data in a single data frame or a list rather than as many separate objects in the global environment.

You can gather all the data frames into one list object (list_data) with mget() and then use lapply() or purrr::map() to keep one row per co_code in each data frame.

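library(dplyr)
library(purrr)

# Collect Merger2000 ... Merger2019 into a named list, looked up by name
list_data <- mget(paste0('Merger', 2000:2019))
# For each data frame, keep the first row for every co_code
result <- map(list_data, ~.x %>% distinct(co_code, .keep_all = TRUE))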

Or in base R:

result <- lapply(list_data, function(x) x[!duplicated(x$co_code), ])
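
If an explicit for loop is preferred, as the question asks, the same idea can be sketched in base R by building each object name with paste0() and fetching it with get(). This is only a minimal sketch: the three toy data frames, the deal column, and the name unique_mergers are made up for illustration and are not from the question.

```r
# Toy stand-ins for Merger2000 ... Merger2002 (hypothetical data, not from the question)
Merger2000 <- data.frame(co_code = c(1, 1, 2), deal = c("a", "b", "c"))
Merger2001 <- data.frame(co_code = c(2, 3, 3), deal = c("d", "e", "f"))
Merger2002 <- data.frame(co_code = c(4, 4, 4), deal = c("g", "h", "i"))

years <- 2000:2002  # use 2000:2019 for the real data

# Pre-allocate a named list to hold one result per year
unique_mergers <- vector("list", length(years))
names(unique_mergers) <- paste0("uniquemerger", years)

for (i in seq_along(years)) {
  # Build the object name, e.g. "Merger2000", and fetch the data frame by name
  df <- get(paste0("Merger", years[i]))
  # Keep the first row for each co_code, like distinct(co_code, .keep_all = TRUE)
  unique_mergers[[i]] <- df[!duplicated(df$co_code), ]
}

# Number of unique firms per year
sapply(unique_mergers, nrow)
```

Each year's result is then available as, for example, unique_mergers[["uniquemerger2000"]]. If separate objects in the global environment are really needed, assign() could be called inside the loop instead, but a list is usually easier to work with afterwards.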
