
How to create and save subset dataframes for a sequence of year-months

I would like to filter the observations of a dataframe for a given year-month, save the result as a separate dataframe, and name it after the respective year-month.

I would be grateful if someone could suggest more efficient code than the version below. Also, this code is not filtering the observations correctly.

data <- data.frame(year  = c(rep(2012,12),rep(2013,12),rep(2014,12),rep(2015,12),rep(2016,12)),
                   month = rep(1:12,5),
                   info = seq(60)*100)
years <- 2012:2016
months <- 1:12
for(year in years){
  for(month in months){
    
    data_sel <- data %>%
      filter(year==year & month==month)
    
    if(month<10){
      month_alt <- paste0("0",month) # months 1-9 should show up as 01-09
    }
    
    Newname <- paste0(year,month_alt,'_','data_sel')
    assign(Newname, data_sel)
  }
}

The output I am looking to get is below (separate objects, each containing the data for a given year-month):

> ls()
 [1] "201201_data_sel" "201202_data_sel" "201203_data_sel" "201204_data_sel"
 [5] "201205_data_sel" "201206_data_sel" "201207_data_sel" "201208_data_sel"
 [9] "201209_data_sel" "201301_data_sel" "201302_data_sel" "201303_data_sel"
[13] "201304_data_sel" "201305_data_sel" "201306_data_sel" "201307_data_sel"
[17] "201308_data_sel" "201309_data_sel" "201401_data_sel" "201402_data_sel"
[21] "201403_data_sel" "201404_data_sel" "201405_data_sel" "201406_data_sel"
[25] "201407_data_sel" "201408_data_sel" "201409_data_sel" "201501_data_sel"
[29] "201502_data_sel" "201503_data_sel" "201504_data_sel" "201505_data_sel"
[33] "201506_data_sel" "201507_data_sel" "201508_data_sel" "201509_data_sel"
[37] "201601_data_sel" "201602_data_sel" "201603_data_sel" "201604_data_sel"
[41] "201605_data_sel" "201606_data_sel" "201607_data_sel" "201608_data_sel"
[45] "201609_data_sel" "data"            "data_sel"        "month"          
[49] "month_alt"       "months"          "Newname"         "year"           
[53] "years"

You could do:

library(dplyr)

g <- data %>% 
  mutate(month = sprintf("%02d", month)) %>% 
  group_by(year, month) 

setNames(group_split(g), with(group_keys(g), paste0("data_sel_", year, month))) %>% 
  list2env(envir = .GlobalEnv)

An object name that starts with a digit is not syntactically valid in R (it can only be referenced with backticks), so "data_sel_" comes first in paste0.
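If you would rather keep the loop from the question, here is a minimal sketch of a repaired version. It rests on two observations: inside filter(), year == year compares the column with itself, because the column masks the loop variable of the same name, and month_alt is only updated for months below 10. The names yr, mo and new_name are introduced here purely for illustration.

```r
library(dplyr)

# Same example data as in the question.
data <- data.frame(year  = rep(2012:2016, each = 12),
                   month = rep(1:12, 5),
                   info  = seq(60) * 100)

for (yr in 2012:2016) {
  for (mo in 1:12) {
    # yr and mo are not column names, so filter() looks them up
    # in the calling environment rather than in the data frame.
    data_sel <- data %>% filter(year == yr, month == mo)

    # sprintf() zero-pads every month, including 10-12.
    new_name <- paste0("data_sel_", yr, sprintf("%02d", mo))
    assign(new_name, data_sel)
  }
}
```

This creates 60 objects named data_sel_201201 through data_sel_201612, each holding the single matching row of the example data.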

As mentioned in the comments, it might be better not to pipe to list2env and instead keep the output as a list with named elements.
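One way to get such a named list is sketched below with base R's split() instead of the dplyr pipeline above. The names key and data_by_ym are assumptions made for this example, not part of the original code.

```r
# Same example data as in the question.
data <- data.frame(year  = rep(2012:2016, each = 12),
                   month = rep(1:12, 5),
                   info  = seq(60) * 100)

# Zero-padded year-month key for every row, e.g. "201201".
key <- sprintf("%d%02d", data$year, data$month)

# split() returns a named list with one data frame per distinct key.
data_by_ym <- split(data, key)

# Elements are looked up by name; no global objects are created.
march_2016 <- data_by_ym[["201603"]]
```

Because list elements are accessed with character strings, the names may start with a digit here, and the whole collection can be processed in one go with lapply() or similar.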
