Use cut in R so that unmatched intervals are included
I have a dataset like this:
sum_col city scen model time_period chill_season
110.02 NY RCP_8 bcc 2076_2099 season_2085_2086
91.26 NY RCP_8 bcc 2076_2099 season_2086_2087
91.05 NY RCP_8 bcc 2076_2099 season_2087_2088
74.96 NY RCP_8 bcc 2076_2099 season_2088_2089
77.97 NY RCP_8 bcc 2076_2099 season_2089_2090
109.05 NY RCP_8 bcc 2076_2099 season_2090_2091
I want to cut the sum_col column and count how many times its values fall into each interval defined by bks = c(-300, seq(20, 75, 5), 300).
However, when I try the following:
result <- dt %>%
mutate(thresh_range = cut(sum_col, breaks = bks)) %>%
group_by(time_period, thresh_range, model, scen, city) %>%
summarize(no_years = n_distinct(chill_season, na.rm = FALSE)) %>%
data.table()
My result looks like:
time_period thresh_range model scen city no_years
2076_2099 (70,75] bcc RCP_8 NY 1
2076_2099 (75,300] bcc RCP_8 NY 5
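So the intervals below 70, such as (20, 25), (25, 30), are not created (because no rows in the data fall into them).
Is there any way to tell cut to return zero for those intervals?
Note again that a row like the following:
a_value_less_than_70_here NY RCP_8 bcc 2076_2099 chill_2076_2077
whose sum_col is less than 70, does not exist in the data. However, I would like to know whether, for such non-existent data, cut can create a 0 or NA to tell us that for NY, with these parameters, no value fell in the (20, 25) interval.
Bottom line: I want to see how many times each given parameter set (model, scen, etc.) falls within each interval (20, 25), (25, 30), etc.
If there are suggestions other than cut that work, that would be great as well.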
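You can use the complete function from the tidyr package to create NA rows for the missing combinations of data. Because thresh_range is a factor, complete also fills in the levels that never occur in the data, and n_distinct with na.rm = TRUE then counts 0 for those filled-in rows: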
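library(dplyr)
library(tidyr)
result <- dt %>%
  mutate(thresh_range = cut(sum_col, breaks = bks)) %>%
  # add NA rows for every combination, including unused factor levels
  complete(time_period, thresh_range, model, scen, city) %>%
  group_by(time_period, thresh_range, model, scen, city) %>%
  # na.rm = TRUE so the filled-in NA rows are counted as 0
  summarize(no_years = n_distinct(chill_season, na.rm = TRUE))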
result
# # A tibble: 13 x 6
# # Groups: time_period, thresh_range, model, scen [?]
# time_period thresh_range model scen city no_years
# <chr> <fct> <chr> <chr> <chr> <int>
# 1 2076_2099 (-300,20] bcc RCP_8 NY 0
# 2 2076_2099 (20,25] bcc RCP_8 NY 0
# 3 2076_2099 (25,30] bcc RCP_8 NY 0
# 4 2076_2099 (30,35] bcc RCP_8 NY 0
# 5 2076_2099 (35,40] bcc RCP_8 NY 0
# 6 2076_2099 (40,45] bcc RCP_8 NY 0
# 7 2076_2099 (45,50] bcc RCP_8 NY 0
# 8 2076_2099 (50,55] bcc RCP_8 NY 0
# 9 2076_2099 (55,60] bcc RCP_8 NY 0
# 10 2076_2099 (60,65] bcc RCP_8 NY 0
# 11 2076_2099 (65,70] bcc RCP_8 NY 0
# 12 2076_2099 (70,75] bcc RCP_8 NY 1
# 13 2076_2099 (75,300] bcc RCP_8 NY 5
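As a cross-check that needs no packages, here is a minimal base R sketch for a single parameter set. It assumes the six sum_col values from the sample in the question and counts rows rather than distinct chill_season values (the two agree here because every row is a different season). It relies on the fact that cut returns a factor carrying all interval levels, and that table keeps levels with no observations:

```r
# Sample values copied from the question (one model/scen/city combination)
sum_col <- c(110.02, 91.26, 91.05, 74.96, 77.97, 109.05)
bks <- c(-300, seq(20, 75, 5), 300)

# cut() returns a factor whose levels cover every interval in bks
thresh_range <- cut(sum_col, breaks = bks)

# table() counts per level and reports 0 for levels with no data
counts <- table(thresh_range)

# Long format, one row per interval; "no_years" mirrors the column name above
as.data.frame(counts, responseName = "no_years")
```

For several parameter sets at once, the complete approach above is likely more convenient, since it expands the combinations of all grouping columns.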