
How to group month columns in a data frame in R

I have a data frame in the following form:

Year <- 1948:2017
Jan<- rnorm(70)
Feb<- rnorm(70)
Mar<- rnorm(70)
Apr<- rnorm(70)
May<- rnorm(70)
Jun<- rnorm(70)
Jul<- rnorm(70)
Aug<- rnorm(70)
Sep<- rnorm(70)
Oct<- rnorm(70)
Nov<- rnorm(70)
Dec<- rnorm(70)
test_df <- cbind.data.frame(Year, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)
head(test_df)
########Console result


    Year        Jan        Feb        Mar         Apr
1 1948 -0.5918300  0.0497792 -0.9302350  0.73162688
2 1949 -1.2731259  0.8933090  0.2340527  1.03077077
3 1950 -0.3727786 -0.5680272  1.4439980  0.53150414
4 1951  0.6520741 -1.4229818 -0.9700416 -0.07151535
5 1952  0.4296101 -0.2294352  1.0863566  1.58652232
6 1953  0.3334147 -0.5386016  1.3432490  1.91005906
          May        Jun         Jul         Aug
1  0.28268233  0.7870373 -0.06178119 -0.14469371
2 -0.02048683 -1.4834607 -0.17926819 -0.38662117
3  0.24659095  0.4929837  0.79430914  0.03486687
4 -0.60123934  1.1304690 -0.13452649 -1.07814801
5  1.39161546  0.6827090  0.54729206  0.50188908
6 -0.53882956 -0.3246258  0.09602686 -2.35509441
         Sep        Oct        Nov         Dec
1  2.0492817  0.6185466  2.0427045 -0.06097253
2  0.7804505 -0.3416864 -1.5192509  2.01911948
3  1.9193976 -0.3120360  1.5646020 -0.04911313
4 -0.1147404 -0.3593639  0.5186583  1.39936930
5  2.4481574 -1.2349037 -0.3519640  0.58429371
6  0.6639531 -0.4471403  0.7071486 -1.02036467

I need to group arbitrary months, such as JanFeb, JanMar, AprFeb, or MarMayNov. The grouping could be any combination of months (there are many possibilities). When I group months, the resulting value should be their average: for example, the JanFeb value should be the mean of the Jan and Feb values, and the MarMayNov value should be the mean of Mar, May, and Nov. How should I approach this problem? Any help is appreciated. Thanks.

Edit

For simplicity, let's say I only want to group 2 months, or at most 3 months, and no more than that.

We can create all possible combinations of column names using lapply and combn. For each combination, compute the row-wise mean of the selected columns as a single column, then bind all such columns together into one data frame.

cols <- names(test_df)[-1]

result <- do.call(cbind, lapply(2:length(cols), function(x)
  do.call(cbind, combn(cols, x, function(y) 
    setNames(data.frame(rowMeans(test_df[y])), 
              paste0(y, collapse = "")), simplify = FALSE))))
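To see what combn contributes here, a small sketch on a four-month subset may help. The names demo_cols and pairs are illustrative only and are not part of the answer's code. Note that combn builds combinations in column order, so a requested group such as AprFeb appears as FebApr; the mean is the same either way.

```r
# Illustrative subset of month names (demo_cols is a hypothetical name)
demo_cols <- month.abb[1:4]  # "Jan" "Feb" "Mar" "Apr"

# combn applies the function to every 2-element combination;
# simplify = FALSE returns the results as a list
pairs <- combn(demo_cols, 2, function(y) paste0(y, collapse = ""),
               simplify = FALSE)
unlist(pairs)
# "JanFeb" "JanMar" "JanApr" "FebMar" "FebApr" "MarApr"

# Number of result columns for 12 months, by allowed group sizes
n_sizes_2_to_3  <- sum(choose(12, 2:3))   # 66 + 220 = 286
n_sizes_2_to_12 <- sum(choose(12, 2:12))  # 2^12 - 12 - 1 = 4083
```

The counts show why limiting the group size matters: using every size from 2 to 12 produces several thousand columns.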

If you want to combine at most 3 months, change 2:length(cols) to 2:3 in lapply.
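If only a handful of specific groupings are needed rather than every combination, a smaller variant is possible. The following is a minimal sketch under that assumption; group_means and groups are hypothetical names introduced for illustration, and the data frame is rebuilt here only so the block runs on its own.

```r
set.seed(42)  # seed chosen arbitrarily so the example is reproducible

# Rebuild a data frame shaped like the one in the question
months_mat <- matrix(rnorm(70 * 12), nrow = 70,
                     dimnames = list(NULL, month.abb))
test_df <- data.frame(Year = 1948:2017, months_mat)

# The groupings to average, each given as a vector of column names
groups <- list(c("Jan", "Feb"), c("Mar", "May", "Nov"), c("Apr", "Feb"))

# Hypothetical helper: one row-mean column per group,
# named by pasting the month names together
group_means <- function(df, groups) {
  out <- lapply(groups, function(g) unname(rowMeans(df[g])))
  names(out) <- vapply(groups, paste0, character(1), collapse = "")
  as.data.frame(out)
}

grouped <- cbind(test_df["Year"], group_means(test_df, groups))
head(grouped)
```

Because the groups are listed explicitly, the column names keep the order in which the months were written (for example AprFeb rather than FebApr).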
