R: strsplit a column in a data frame

Question

I have a data frame called 'data' with multiple columns. One of them, 'Runtime', holds values in two formats:

Runtime
1 h 10 min
67 min
1 h 0 min
86 min
97 min

I want to convert all of them into minutes. I have tried 'strsplit' and 'str_split_fixed'. Can anyone show me a way to achieve this, by splitting or by any other method?

Thank you in advance!

Answer 1

I think I saw this kind of solution somewhere. Don't hit me.

df = data.frame(Runtime = c('1 h 10 min', '67 min', '1 h 0 min', '86 min', '97 min'))

df$exp <- gsub("h", "* 60 +", df$Runtime)
df$exp <- gsub("min", "* 1", df$exp)

sapply(df$exp, FUN = function(x) eval(parse(text = x)))

1 * 60 + 10 * 1          67 * 1  1 * 60 + 0 * 1          86 * 1          97 * 1 
             70              67              60              86              97
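Since eval(parse()) will run whatever text it is given, here is a minimal sketch of the same trick with a guard in front of it. The guard pattern and the Minutes column name are my own assumptions, not part of the answer above; it also stores the result as a plain numeric column instead of a named vector.

```r
df <- data.frame(Runtime = c("1 h 10 min", "67 min", "1 h 0 min", "86 min", "97 min"),
                 stringsAsFactors = FALSE)

# Guard (an added assumption): accept only "<n> min" or "<n> h <n> min",
# so nothing unexpected is ever handed to eval()
stopifnot(all(grepl("^[0-9 ]+(h[0-9 ]+)?min$", df$Runtime)))

# Rewrite each value into an arithmetic expression, e.g. "1 * 60 + 10 * 1"
expr <- gsub("min", "* 1", gsub("h", "* 60 +", df$Runtime))

# Evaluate each expression; USE.NAMES = FALSE drops the expression strings as names
df$Minutes <- vapply(expr, function(e) eval(parse(text = e)), numeric(1),
                     USE.NAMES = FALSE)
```

If a value ever arrives in another format, the stopifnot() call fails before anything is evaluated.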

Answer 2

You can do it in one call using gsubfn and a regex (the trailing x is the character vector of runtimes, e.g. df$Runtime):

library(gsubfn)
gsubfn("^(?:(\\d+)\\s*h)?\\s*(\\d+)\\s*min.*$",
 ~ sum(as.numeric(x) * 60, as.numeric(y), as.numeric(z), na.rm=TRUE), x)
#[1] "70" "67" "60" "86" "97"
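If gsubfn is not available, roughly the same regex idea can be sketched in base R. This is only an illustration: to_minutes is a hypothetical helper name, and it assumes every value is either "<n> min" or "<n> h <n> min".

```r
# Hypothetical helper: convert "1 h 10 min" / "67 min" strings to minutes
to_minutes <- function(runtime) {
  runtime <- as.character(runtime)
  has_hours <- grepl("h", runtime)

  # Hours: the leading number before "h", or "0" when there is no hours part
  hours <- ifelse(has_hours,
                  sub("^\\s*(\\d+)\\s*h.*$", "\\1", runtime, perl = TRUE),
                  "0")

  # Minutes: the number right before "min", skipping an optional "... h" prefix
  mins <- sub("^(?:.*h)?\\s*(\\d+)\\s*min.*$", "\\1", runtime, perl = TRUE)

  as.numeric(hours) * 60 + as.numeric(mins)
}

to_minutes(c("1 h 10 min", "67 min", "1 h 0 min", "86 min", "97 min"))
```

Unlike the gsubfn call, this returns a numeric vector rather than character strings.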

Answer 3

Here's an example of how you can do it:

# setting up your data.frame of interest
df = data.frame(Runtime = c('1 h 10 min', '67 min', '1 h 0 min', '86 min', '97 min'))

df$Runtime = gsub(' min', '', df$Runtime) # remove the min labels
hrs = grepl('h', x = df$Runtime) # which values are in an "x h y min" format?
# convert the "x h y min" entries into numeric values in minutes
runtime_sub = sapply(strsplit(df[hrs, 'Runtime'], ' h '), function(i) sum(as.numeric(i) * c(60, 1)))
# convert the vector to numeric (yes, it's supposed to return a warning; ignore it,
# the resulting NAs are overwritten on the next line)
df$Runtime = as.numeric(df$Runtime)
df[hrs, 'Runtime'] = runtime_sub # add the converted values

This results in a numeric Runtime column holding 70, 67, 60, 86 and 97 (minutes).

Answer 4

1) Read df[[1]] with read.table. If the third column is NA, the first column gives the minutes; otherwise, 60 times the first column plus the third column gives the minutes:

with(read.table(text = as.character(df[[1]]), fill = TRUE), 
        ifelse(is.na(V3), V1, 60*V1 + V3))
## [1] 70 67 60 86 97

2) A variation is to paste "0 h" at the beginning of each component that does not have an h, giving hm, and then read hm in and compute 60 times the first column plus the third column.

hm <- paste(ifelse(grepl("h", df[[1]]), "", "0 h"), df[[1]])
with(read.table(text = hm), 60 * V1 + V3)
## [1] 70 67 60 86 97
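To see why V1 and V3 are the columns used in 1), it may help to look at what read.table actually parses. The sketch below assumes the same df as in the other answers; parsed and minutes are names introduced here only for illustration.

```r
df <- data.frame(Runtime = c("1 h 10 min", "67 min", "1 h 0 min", "86 min", "97 min"),
                 stringsAsFactors = FALSE)

# fill = TRUE pads the short "67 min" rows with blank fields,
# which become NA in the numeric column V3
parsed <- read.table(text = df$Runtime, fill = TRUE)

# V1 is the first number on each row; V3 is the minutes part when an "h" is present
parsed[, c("V1", "V3")]

# Same computation as in 1), written against the intermediate table
minutes <- ifelse(is.na(parsed$V3), parsed$V1, 60 * parsed$V1 + parsed$V3)
```

Rows without an hours part have NA in V3, which is exactly the condition the ifelse() tests.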
