
Grouping months by winter season instead of year in R

I have the following data frame:

year <- c(1949, 1950, 1950, 1950, 1951, 1951, 1951, 1952, 1952, 1952, 1953, 1953, 1953)
month <- c(12, 1, 2, 12, 1, 2, 12, 1, 2, 12, 1, 2, 12)
df <- data.frame(year, month)
 df
   year month
1  1949    12
2  1950     1
3  1950     2
4  1950    12
5  1951     1
6  1951     2
7  1951    12
8  1952     1
9  1952     2
10 1952    12
11 1953     1
12 1953     2
13 1953    12

Here month 1 is January and month 12 is December. I would like to group the rows by winter season. For example, month 12 of 1949 should be grouped with months 1 and 2 of 1950, because they belong to the same winter. The ideal outcome would be:

   year month winterseason
1  1949    12            1
2  1950     1            1
3  1950     2            1
4  1950    12            2
5  1951     1            2
6  1951     2            2
7  1951    12            3
8  1952     1            3
9  1952     2            3
10 1952    12            4
11 1953     1            4
12 1953     2            4
13 1953    12            5 

Any ideas?

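If the rows are already sorted chronologically (by year, then month) and the first row is a December, as in your example, a cumulative sum over the December rows numbers the seasons: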

df$winterseason <- cumsum(df$month == 12)
df$winterseason
#[1] 1 1 1 2 2 2 3 3 3 4 4 4 5
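
If the rows are not guaranteed to be in order, a minimal sketch is to sort first and then apply the same cumulative sum. The data frame df2 below is a hypothetical, shuffled subset of the question's data, used only for illustration:

```r
# Hypothetical unsorted subset of the question's data
df2 <- data.frame(
  year  = c(1950, 1949, 1950, 1951, 1950, 1951),
  month = c(2, 12, 1, 1, 12, 2)
)

# Sort chronologically so each December comes before its January and February
df2 <- df2[order(df2$year, df2$month), ]

# Each December starts a new season; cumsum() counts the Decembers seen so far
df2$winterseason <- cumsum(df2$month == 12)
df2$winterseason
```

Note that if the sorted data began with a January or February, those first rows would be labelled 0 rather than 1.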

This labels each season with a "yearqtr" class object giving the year and quarter in which that winter ends. We convert the year and month to a "yearmon" class object and add 1/12, which shifts each month forward by one month (so December, January and February become January, February and March of the same calendar quarter). We then convert the result to a "yearqtr" class object.

library(zoo)

transform(df, season = as.yearqtr(as.yearmon(paste(year, month, sep = "-")) + 1/12))

giving:

   year month  season
1  1949    12 1950 Q1
2  1950     1 1950 Q1
3  1950     2 1950 Q1
4  1950    12 1951 Q1
5  1951     1 1951 Q1
6  1951     2 1951 Q1
7  1951    12 1952 Q1
8  1952     1 1952 Q1
9  1952     2 1952 Q1
10 1952    12 1953 Q1
11 1953     1 1953 Q1
12 1953     2 1953 Q1
13 1953    12 1954 Q1

Note that if season is a variable containing the season column values, then as.integer(season) and cycle(season) extract the year and quarter numbers respectively. So, for example, if there were also non-winter rows, cycle(season) == 1 would identify the rows that fall in winter.
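
As a rough sketch of that note, assuming the zoo package is installed, the following uses a hypothetical data frame df3 that also contains a June row, and keeps only the winter rows:

```r
library(zoo)

# Hypothetical data that also contains a non-winter month (June)
df3 <- data.frame(
  year  = c(1949, 1950, 1950, 1950, 1950),
  month = c(12, 1, 2, 6, 12)
)

# Shift each month forward by one month, then convert to year and quarter
season <- as.yearqtr(as.yearmon(paste(df3$year, df3$month, sep = "-")) + 1/12)

as.integer(season)  # year in which each season ends
cycle(season)       # quarter number; 1 marks the winter rows

# Keep only the winter rows
winter_rows <- df3[cycle(season) == 1, ]
winter_rows
```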

Try

year <- c(1949, 1950, 1950, 1950, 1951, 1951, 1951, 1952, 1952, 1952, 1953, 1953, 1953)
month <- c(12, 1, 2, 12, 1, 2, 12, 1, 2, 12, 1, 2, 12)
df <- data.frame(year, month)
df$season <- ifelse(month == 12,year+1,year) - min(year)

This is not very elegant, and it relies on the earliest row being a December (as it is here), but it produces your ideal outcome:

   year month season
1  1949    12      1
2  1950     1      1
3  1950     2      1
4  1950    12      2
5  1951     1      2
6  1951     2      2
7  1951    12      3
8  1952     1      3
9  1952     2      3
10 1952    12      4
11 1953     1      4
12 1953     2      4
13 1953    12      5
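
A variation on the same arithmetic, offered only as a sketch, numbers the seasons relative to the earliest winter rather than the earliest calendar year, so the first season is 1 even when the data start in January. The data frame df4 and the helper variable winter_year are hypothetical names introduced for this example:

```r
# Hypothetical data that start in January rather than December
df4 <- data.frame(
  year  = c(1950, 1950, 1950, 1951, 1951),
  month = c(1, 2, 12, 1, 2)
)

# Year in which each winter ends: a December counts towards the following year
winter_year <- df4$year + (df4$month == 12)

# Number the seasons starting from 1 at the earliest winter in the data
df4$season <- winter_year - min(winter_year) + 1
df4$season
```

If a whole winter is missing from the data, this leaves a gap in the numbering; match(winter_year, sort(unique(winter_year))) would give consecutive numbers instead.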

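Here is an alternative using magrittr and data.table. It also counts November towards the following winter, which makes no difference for the data above:

library(magrittr)  # provides %>%

df$winterYear <- ifelse(df$month %in% c(11, 12), df$year + 1, df$year) %>% data.table::rleidv()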

Result:

   year month winterYear
1  1949    12          1
2  1950     1          1
3  1950     2          1
4  1950    12          2
5  1951     1          2
6  1951     2          2
7  1951    12          3
8  1952     1          3
9  1952     2          3
10 1952    12          4
11 1953     1          4
12 1953     2          4
13 1953    12          5

Side note: to be safe, sort your data by year and month first, since rleidv() numbers runs of consecutive equal values in row order.
