
How to find the max and min values of string rows of a dataframe in R?

For each row of my data, I want to get the min and max of a set of values that are stored together in a single comma-separated character string. For example, consider the following data:

df <- data.frame(id=c(1:3),
                 yr=c("2000,2009,1999,2022","2019,2018,2006,2007","1998,2012,2000,2020"))

Desired output:

id                   yr  min_yr    max_yr
1   2000,2009,1999,2022    1999      2022
2   2019,2018,2006,2007    2006      2019
3   1998,2012,2000,2020    1998      2020

This also works for years with differing numbers of digits, such as 860, 1543, 2023, ..., because the values are converted to numeric before they are compared.

df[c("min_yr", "max_yr")] <-
   t(sapply(strsplit(df$yr, ","), \(x) range(as.numeric(x))))

df
#  id                  yr min_yr max_yr
#1  1 2000,2009,1999,2022   1999   2022
#2  2 2019,2018,2006,2007   2006   2019
#3  3 1998,2012,2000,2020   1998   2020
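
To see why the as.numeric() conversion matters, here is a minimal sketch using a hypothetical string of years with different digit counts (not part of the question's data). Without the conversion, range() compares the pieces as text:

```r
# Hypothetical input: years with different numbers of digits
yrs <- "860,1543,2023"
parts <- strsplit(yrs, ",")[[1]]

# Character comparison is lexicographic, so this is not the numeric range
chr_range <- range(parts)
chr_range
# [1] "1543" "860"

# Converting to numeric first gives the intended result
num_range <- range(as.numeric(parts))
num_range
# [1]  860 2023
```

For four-digit years the two orderings happen to agree, which is why the difference is easy to miss.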

Here's a one-liner in base R that also works on any number.

df[c('min_yr', 'max_yr')] <- t(sapply(df$yr, \(x) range(scan(text = x, sep = ',', quiet = TRUE))))

Resulting in:

df
#>   id                  yr min_yr max_yr
#> 1  1 2000,2009,1999,2022   1999   2022
#> 2  2 2019,2018,2006,2007   2006   2019
#> 3  3 1998,2012,2000,2020   1998   2020
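
# Convert to numeric before taking min/max; on character vectors they compare lexicographically
df$min_yr <- sapply(strsplit(df$yr, ","), \(x) min(as.numeric(x)))
df$max_yr <- sapply(strsplit(df$yr, ","), \(x) max(as.numeric(x)))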

  id                  yr min_yr max_yr
1  1 2000,2009,1999,2022   1999   2022
2  2 2019,2018,2006,2007   2006   2019
3  3 1998,2012,2000,2020   1998   2020

Using dplyr and purrr:

library(dplyr)
library(purrr)
mutate(df, strsplit(yr, ",") |>
           map(as.numeric) |>
           map(range) |>
           map_dfr(setNames, c("min", "max")))

##>   id                  yr  min  max
##> 1  1 2000,2009,1999,2022 1999 2022
##> 2  2 2019,2018,2006,2007 2006 2019
##> 3  3 1998,2012,2000,2020 1998 2020

library(stringr)
library(dplyr)
df %>%
  rowwise() %>%
  mutate(min_yr = min(as.numeric(str_split_1(yr, ","))),
         max_yr = max(as.numeric(str_split_1(yr, ","))))

     id yr                  min_yr max_yr
  <int> <chr>                <dbl>  <dbl>
1     1 2000,2009,1999,2022   1999   2022
2     2 2019,2018,2006,2007   2006   2019
3     3 1998,2012,2000,2020   1998   2020

Using pmin/pmax from base R: read the yr column with read.csv to create a data.frame with one column per value, then apply pmin/pmax across those columns.

d1 <- read.csv(text = df$yr, header = FALSE)
df$min_yr <- do.call(pmin, d1)
df$max_yr <- do.call(pmax, d1)

Output:

> df
  id                  yr min_yr max_yr
1  1 2000,2009,1999,2022   1999   2022
2  2 2019,2018,2006,2007   2006   2019
3  3 1998,2012,2000,2020   1998   2020
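
The example data has exactly four values in every row. If rows can hold different numbers of values, the same read.csv idea can be adapted. The following is a rough sketch, assuming a hypothetical data frame df2 and relying on read.csv padding short rows with NA (fill = TRUE, its default) and on the na.rm argument of pmin/pmax:

```r
# Hypothetical data: rows with different numbers of values
df2 <- data.frame(id = 1:3,
                  yr = c("2000,2009,1999", "2019,2018", "1998"))

# fill = TRUE (the read.csv default) pads short rows with NA
d2 <- read.csv(text = df2$yr, header = FALSE, fill = TRUE)

# na.rm = TRUE makes pmin/pmax ignore the NA padding
df2$min_yr <- do.call(pmin, c(d2, na.rm = TRUE))
df2$max_yr <- do.call(pmax, c(d2, na.rm = TRUE))

df2
```

One caveat: read.csv infers the number of columns from the first few lines of input, so if the longest row appears late in a large data set it may be safer to pass col.names explicitly or to fall back on the strsplit-based approaches.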
