I have a character field in a dataframe that contains numbers eg (0.5,3.5,7.8,2.4).
For every record I am trying to extract the largest value from the string and put it in a new column.
eg
x csi
1 0.5, 6.7, 2.3
2 9.5, 2.6, 1.1
3 0.7, 2.3, 5.1
4 4.1, 2.7, 4.7
The desired output would be:
x csi csi_max
1 0.5, 6.7, 2.3 6.7
2 9.5, 2.6, 1.1 9.5
3 0.7, 2.3, 5.1 5.1
4 4.1, 2.7, 4.7 4.7
I have had various attempts ...with my latest attempt being the following - which provides the maximum csi score from the entire column rather than from the individual row's csi numbers...
library(stringr)
numextract <- function(string){
str_extract(string, "\\-*\\d+\\.*\\d*")
}
df$max_csi <- max(numextract(df$csi))
Thank you
We can use tidyverse
library(dplyr)
library(tidyr)
df1 %>%
separate_rows(csi) %>%
group_by(x) %>%
summarise(csi_max = max(csi)) %>%
left_join(df1, .)
# x csi csi_max
#1 1 0.5, 6.7, 2.3 6.7
#2 2 9.5, 2.6, 1.1 9.5
#3 3 0.7, 2.3, 5.1 5.1
#4 4 4.1, 2.7, 4.7 4.7
Or this can be done with pmax
from base R
after separating the 'csi' column into a data.frame
with read.table
df1$csi_max <- do.call(pmax, read.table(text=df1$csi, sep=","))
Hope this helps!
df$csi_max <- sapply(df$csi, function(x) max(as.numeric(unlist(strsplit(as.character(x), split=",")))))
Output is:
x csi csi_max
1 1 0.5, 6.7, 2.3 6.7
2 2 9.5, 2.6, 1.1 9.5
3 3 0.7, 2.3, 5.1 5.1
4 4 4.1, 2.7, 4.7 4.7
#sample data
> dput(df)
structure(list(x = 1:4, csi = structure(c(1L, 4L, 2L, 3L), .Label = c("0.5, 6.7, 2.3",
"0.7, 2.3, 5.1", "4.1, 2.7, 4.7", "9.5, 2.6, 1.1"), class = "factor")), .Names = c("x",
"csi"), class = "data.frame", row.names = c(NA, -4L))
Edit:
As suggested by @RichScriven, the more efficient way could be
df$csi_max <- sapply(strsplit(as.character(df$csi), ","), function(x) max(as.numeric(x)))
A solution using the splitstackshape package.
library(splitstackshape)
dat$csi_max <- apply(cSplit(dat, "csi")[, -1], 1, max)
dat
# x csi csi_max
# 1 1 0.5, 6.7, 2.3 6.7
# 2 2 9.5, 2.6, 1.1 9.5
# 3 3 0.7, 2.3, 5.1 5.1
# 4 4 4.1, 2.7, 4.7 4.7
DATA
dat <- read.table(text = "x csi
1 '0.5, 6.7, 2.3'
2 '9.5, 2.6, 1.1'
3 '0.7, 2.3, 5.1'
4 '4.1, 2.7, 4.7'",
header = TRUE, stringsAsFactors = FALSE)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.