Column name with the min and max values in a dataset in R
I have this dataset:
Year January February March April May June July August
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2018 45 51 63 61 79 85 88 85
2 2017 51 60 65 69 75 82 86 84
3 2016 47 55 61 68 72 84 87 85
... with 20 more rows
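I want to get the month corresponding to the minimum and maximum value of each row, as well as the difference between the maximum and the minimum. Here is my code for the min and max: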
x <- colnames(data)[apply(data[,c(2:9)],1,which.max)]
y <- colnames(data)[apply(data[,c(2:9)],1,which.min)]
data$MaxMonth <- x
data$MinMonth <- y
However, for some rows the which.min call gives me "Year" as the output:
Year January February March April May June July August MaxMonth MinMonth Diff
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2018 45 51 63 61 79 85 88 85 July January 43
2 2017 51 60 65 69 75 82 86 84 July Year 35
3 2016 47 55 61 68 72 84 87 85 July Year 40
... with 20 more rows
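A minimal sketch of the likely cause, using only the three rows visible above (the other 20 rows are not shown, so the real data may behave slightly differently). The positions returned by which.min count within the subset data[, 2:9], while colnames(data) still includes Year in position 1:

```r
# Rebuild the three visible rows; this is an assumption about the real data frame
data <- data.frame(
  Year = c("2018", "2017", "2016"),
  January = c(45, 51, 47), February = c(51, 60, 55),
  March = c(63, 65, 61), April = c(61, 69, 68),
  May = c(79, 75, 72), June = c(85, 82, 84),
  July = c(88, 86, 87), August = c(85, 84, 85)
)

# Positions are relative to the 8 month columns, not to the full data frame
idx <- unname(apply(data[, 2:9], 1, which.min))
idx                        # 1 1 1 (January is the first month column)

colnames(data)[idx]        # "Year" "Year" "Year": position 1 of ALL names
colnames(data)[2:9][idx]   # "January" "January" "January": same subset as idx
```

Indexing the names with the same 2:9 subset keeps the positions and the labels aligned.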
We can use pivot_longer to reshape to long format, group by 'Year', get the column name corresponding to the max/min of 'value' (with which.max/which.min), and then join with the original data:
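library(dplyr)
library(tidyr)
df %>%
  pivot_longer(cols = -1) %>%   # every column except Year goes into name/value
  group_by(Year) %>%
  summarise(maxMonth = name[which.max(value)],
            minMonth = name[which.min(value)]) %>%
  left_join(df, ., by = "Year")  # attach the results to the original wide data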
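The question also asks for the difference between the maximum and the minimum, which the long-format approach can compute in the same summarise call. A sketch, assuming df holds the three rows shown in the question; the column names month, value, and diff are illustrative choices, not part of the original answer:

```r
library(dplyr)
library(tidyr)

# Three visible rows from the question; the real df has 20 more
df <- data.frame(
  Year = c("2018", "2017", "2016"),
  January = c(45, 51, 47), February = c(51, 60, 55),
  March = c(63, 65, 61), April = c(61, 69, 68),
  May = c(79, 75, 72), June = c(85, 82, 84),
  July = c(88, 86, 87), August = c(85, 84, 85)
)

# One row per Year with the extreme months and their spread
res <- df %>%
  pivot_longer(cols = -Year, names_to = "month", values_to = "value") %>%
  group_by(Year) %>%
  summarise(maxMonth = month[which.max(value)],
            minMonth = month[which.min(value)],
            diff = max(value) - min(value),
            .groups = "drop")

# Join back so the wide columns and the original row order are kept
out <- left_join(df, res, by = "Year")
out
```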
There is no need to call apply three times. You can do this:
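nms <- names(df)[-1]
n <- seq_len(nrow(df))
# ties.method = "first" matches which.max/which.min; the default breaks ties at random
maxMonth <- max.col(df[-1], ties.method = "first")
minMonth <- max.col(-df[-1], ties.method = "first")
diff <- df[-1][cbind(n, maxMonth)] - df[-1][cbind(n, minMonth)]
cbind(df, maxMonth = nms[maxMonth], minMonth = nms[minMonth], diff)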
Year January February March April May June July August maxMonth minMonth diff
1 2018 45 51 63 61 79 85 88 85 July January 43
2 2017 51 60 65 69 75 82 86 84 July January 35
3 2016 47 55 61 68 72 84 87 85 July January 40
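One thing to watch with max.col: by default it breaks ties at random, so two months with the same value can give different answers from run to run. A small illustration on a made-up matrix (the values are hypothetical, chosen only to contain ties):

```r
# Row 1 has a tie between columns 1 and 2; row 2 between columns 2 and 3
m <- matrix(c(5, 5, 1,
              2, 9, 9), nrow = 2, byrow = TRUE)

first_idx <- max.col(m, ties.method = "first")  # 1 2
last_idx  <- max.col(m, ties.method = "last")   # 2 3
wm_idx    <- apply(m, 1, which.max)             # 1 2, same as "first"
```

Passing ties.method = "first" makes max.col agree with which.max, which always returns the first maximum.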
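I think the comments on your post highlight the problem: which.max and which.min return positions within the subset data[, 2:9], so the names have to be taken from that same subset rather than from all of colnames(data).
You should write: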
x <- colnames(data)[2:9][apply(data[,c(2:9)],1,which.max)]
y <- colnames(data)[2:9][apply(data[,c(2:9)],1,which.min)]
data$MaxMonth <- x
data$MinMonth <- y
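The Diff column from the question can be added in the same base R style. A sketch, again assuming data holds the three visible rows with the months in columns 2 to 9:

```r
# Three visible rows from the question; the real data has 20 more
data <- data.frame(
  Year = c("2018", "2017", "2016"),
  January = c(45, 51, 47), February = c(51, 60, 55),
  March = c(63, 65, 61), April = c(61, 69, 68),
  May = c(79, 75, 72), June = c(85, 82, 84),
  July = c(88, 86, 87), August = c(85, 84, 85)
)

months <- data[, 2:9]

# Look the names up in the same subset that which.max/which.min index into
data$MaxMonth <- colnames(months)[apply(months, 1, which.max)]
data$MinMonth <- colnames(months)[apply(months, 1, which.min)]

# Row-wise spread: max minus min over the month columns
data$Diff <- unname(apply(months, 1, function(r) max(r) - min(r)))
data
```

Taking the months subset once and reusing it avoids repeating the 2:9 index in every line.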
Does something like this work better?
library(tidyverse)
df %>%
mutate(max_month = pmap(across(January:August), ~ names(c(...)[which.max(c(...))])),
min_month = pmap(across(January:August), ~ names(c(...)[which.min(c(...))]))
) %>%
unnest(cols = c(max_month, min_month)) %>%
rowwise() %>%
mutate(Diff = max(c_across(January:August)) - min(c_across(January:August)))
Output:
Year January February March April May June July August max_month min_month Diff
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl>
1 2018 45 51 63 61 79 85 88 85 July January 43
2 2017 51 60 65 69 75 82 86 84 July January 35
3 2016 47 55 61 68 72 84 87 85 July January 40