How to find the max and min values of string rows of a dataframe in R?
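For each row of my data, I want to get the minimum and maximum of values that are stored as a character string. For example, consider the following data: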
df <- data.frame(id=c(1:3),
yr=c("2000,2009,1999,2022","2019,2018,2006,2007","1998,2012,2000,2020"))
Desired output:
id yr min_yr max_yr
1 2000,2009,1999,2022 1999 2022
2 2019,2018,2006,2007 2006 2019
3 1998,2012,2000,2020 1998 2020
It should also work for years such as 860, 1543, 2023, ...
df[c("min_yr", "max_yr")] <-
t(sapply(strsplit(df$yr, ","), \(x) range(as.numeric(x))))
df
# id yr min_yr max_yr
#1 1 2000,2009,1999,2022 1999 2022
#2 2 2019,2018,2006,2007 2006 2019
#3 3 1998,2012,2000,2020 1998 2020
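The question also asks that this work for years with fewer than four digits. A quick check of the same split-and-range approach might look like the sketch below; the data frame df2 and its values are hypothetical and not part of the original post.

```r
# Hypothetical data with years of different digit counts
df2 <- data.frame(id = 1:2,
                  yr = c("860,1543,2023", "999,12,1500"))

# Split each string on commas, convert to numeric, take min and max
rng <- t(sapply(strsplit(df2$yr, ","), \(x) range(as.numeric(x))))
df2[c("min_yr", "max_yr")] <- rng
df2
```

Because the values are converted with as.numeric before range is called, the comparison is numeric, so the number of digits should not matter.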
Here is a one-liner in base R that also works for any numbers.
df[c('min_yr', 'max_yr')] <- t(sapply(df$yr, \(x) range(scan(text=x, sep = ','))))
This gives:
df
#> id yr min_yr max_yr
#> 1 1 2000,2009,1999,2022 1999 2022
#> 2 2 2019,2018,2006,2007 2006 2019
#> 3 3 1998,2012,2000,2020 1998 2020
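# Convert to numeric before comparing: min/max on character strings compare lexicographically
df$min_yr <- sapply(strsplit(df$yr, ","), \(x) min(as.numeric(x)))
df$max_yr <- sapply(strsplit(df$yr, ","), \(x) max(as.numeric(x)))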
id yr min_yr max_yr
1 1 2000,2009,1999,2022 1999 2022
2 2 2019,2018,2006,2007 2006 2019
3 3 1998,2012,2000,2020 1998 2020
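One thing to keep in mind with min and max on split strings: applied to character vectors they compare text, not numbers, so the result depends on whether the conversion to numeric happens before or after the comparison. A minimal illustration, using the example years from the question in a hypothetical vector x:

```r
# The pieces returned by strsplit() are character strings
x <- c("860", "1543", "2023")

chr_min <- min(x)              # text comparison: "1543" sorts before "860"
chr_max <- max(x)              # text comparison: "860" sorts last
num_min <- min(as.numeric(x))  # numeric comparison
num_max <- max(as.numeric(x))  # numeric comparison

c(chr_min, chr_max)
c(num_min, num_max)
```

For data where every year has four digits the two orderings agree, which is why the difference does not show up in the example data.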
Using dplyr and purrr:
library(dplyr)
library(purrr)
mutate(df, strsplit(yr, ",") |>
map(as.numeric) |>
map(range) |>
map_dfr(setNames, c("min", "max")))
##> id yr min max
##> 1 1 2000,2009,1999,2022 1999 2022
##> 2 2 2019,2018,2006,2007 2006 2019
##> 3 3 1998,2012,2000,2020 1998 2020
library(stringr)
library(dplyr)
df %>%
  rowwise() %>%
  mutate(min_yr = min(as.numeric(str_split_1(yr, ","))),
         max_yr = max(as.numeric(str_split_1(yr, ","))))
     id yr                  min_yr max_yr
  <int> <chr>                <dbl>  <dbl>
1     1 2000,2009,1999,2022   1999   2022
2     2 2019,2018,2006,2007   2006   2019
3     3 1998,2012,2000,2020   1998   2020
Using pmin/pmax from base R: read the yr column with read.csv to create a data.frame, then apply pmin/pmax.
d1 <- read.csv(text = df$yr, header = FALSE)
df$min_yr <- do.call(pmin, d1)
df$max_yr <- do.call(pmax, d1)
Output:
> df
id yr min_yr max_yr
1 1 2000,2009,1999,2022 1999 2022
2 2 2019,2018,2006,2007 2006 2019
3 3 1998,2012,2000,2020 1998 2020
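The pmin/pmax approach relies on read.csv giving each value its own column. If some rows hold fewer values than others, the short rows are padded with NA, and pmin/pmax would then return NA for them. A sketch for that case follows; the ragged data frame df3 is hypothetical, and passing na.rm = TRUE through do.call is one possible way to handle it, not something the original answer covers.

```r
# Hypothetical data: rows with different numbers of years
df3 <- data.frame(id = 1:2, yr = c("2000,2009", "860,1543,2023"))

# fill = TRUE (the read.csv default) pads short rows with NA
d3 <- read.csv(text = df3$yr, header = FALSE, fill = TRUE)

# na.rm = TRUE makes pmin/pmax skip the padded NA cells
df3$min_yr <- do.call(pmin, c(d3, na.rm = TRUE))
df3$max_yr <- do.call(pmax, c(d3, na.rm = TRUE))
df3
```

Note that read.csv infers the number of columns from the first five lines of input, so if a longer row appears only later in the data, col.names would need to be supplied explicitly.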