Split a column into multiple columns in R

Question

I have a data frame column that I need to split into 3 separate columns. It looks like this:

I:500-600
I:700-900
II:200-250

I'd like to split this into the following 3 columns:

This has proved slightly trickier than I had hoped. Any help would be appreciated.

Answer 1

You can use strsplit with a regular expression that matches either : or - (the alternation ":|-"). This gives you a list, which you can process further.

> test <- c('I:500-600', 'I:700-900', 'II:200-250')
> do.call(rbind.data.frame, strsplit(test, ":|-"))
  c..I....I....II.. c..500....700....200.. c..600....900....250..
1                 I                    500                    600
2                 I                    700                    900
3                II                    200                    250

If clean column names are important:

> as.data.frame(do.call(rbind, strsplit(test, ":|-")))
  V1  V2  V3
1  I 500 600
2  I 700 900
3 II 200 250
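As a minimal sketch of the "process further" step, the split pieces can be given meaningful names in base R. The column names group, start and end are hypothetical, chosen only for illustration; the question does not say what the columns should be called.

```r
# Same input vector as in the answer above.
test <- c("I:500-600", "I:700-900", "II:200-250")

# strsplit() returns a list of character vectors;
# do.call(rbind, ...) stacks them into a 3 x 3 character matrix.
parts <- do.call(rbind, strsplit(test, "[:-]"))

# Hypothetical column names; replace them with whatever suits your data.
result <- setNames(as.data.frame(parts, stringsAsFactors = FALSE),
                   c("group", "start", "end"))
result
```

All three columns are still character at this point, so the numeric ones need converting (for example with as.integer) before doing arithmetic on them.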

Answer 2

Another solution, with str_match from the stringr package:

x <- c("I:500-600", "I:700-900", "II:200-250")
library(stringr)
as.data.frame(str_match(x, "^(.*):(.*)-(.*)$")[,-1])
##   V1  V2  V3
## 1  I 500 600
## 2  I 700 900
## 3 II 200 250

In the above regular expression we match 3 substrings: from the beginning to the :, from the : to the -, and from the - to the end. Each matched substring becomes a separate column in the resulting object.
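If stringr is not available, the same capture-group idea can be sketched in base R with regexec and regmatches. The stricter pattern below (digits only for the two numeric parts) is an assumption about the data, not something the answer requires.

```r
x <- c("I:500-600", "I:700-900", "II:200-250")

# Three capture groups: everything before ":", then digits, "-", digits.
pattern <- "^([^:]+):([0-9]+)-([0-9]+)$"

# regmatches() returns one character vector per input string:
# the full match first, followed by the three captured groups.
m <- regmatches(x, regexec(pattern, x))

# Stack into a matrix and drop the full-match column.
parts <- do.call(rbind, m)[, -1, drop = FALSE]
as.data.frame(parts, stringsAsFactors = FALSE)
```

This sketch assumes every string matches the pattern. A non-matching string yields an empty vector from regmatches, so it would need to be handled before the rbind step.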

Answer 3

Other options include extract from tidyr:

library(tidyr)
extract(df1, V1, into=c('V1','V2', 'V3'),
            '([^:]*):([0-9]*)-([0-9]*)', convert=TRUE)
#  V1  V2  V3
#1  I 500 600
#2  I 700 900
#3 II 200 250

Or tstrsplit from data.table:

library(data.table)#v1.9.5+
setDT(df1)[, tstrsplit(V1, '[:-]', type.convert=TRUE)]
#   V1  V2  V3
#1:  I 500 600
#2:  I 700 900
#3: II 200 250

NOTE: Both options have an argument that converts the class of the output columns (convert in extract, type.convert in tstrsplit).

Data

df1 <- structure(list(V1 = c("I:500-600", "I:700-900", "II:200-250")), 
 .Names = "V1", class = "data.frame", row.names = c(NA, -3L))
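To illustrate what the class conversion in the note amounts to, here is a rough base R sketch that splits the column and then applies type.convert to each piece. It is not how tidyr or data.table implement their arguments, only an approximation of the effect on this data.

```r
# Same data as df1 above, rebuilt here so the snippet is self-contained.
df1 <- data.frame(V1 = c("I:500-600", "I:700-900", "II:200-250"),
                  stringsAsFactors = FALSE)

# Split on ":" or "-" and bind the pieces into a data frame (all character).
parts <- as.data.frame(do.call(rbind, strsplit(df1$V1, "[:-]")),
                       stringsAsFactors = FALSE)

# Convert each column to its most specific type; as.is = TRUE keeps
# non-numeric columns as character instead of turning them into factors.
parts[] <- lapply(parts, type.convert, as.is = TRUE)
sapply(parts, class)
```

After the conversion, V1 stays character while V2 and V3 become integer, so they can be used in arithmetic directly.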

Answer 4

I would recommend cSplit from my "splitstackshape" package.

The syntax is pretty straightforward: cSplit(yourInputDataFrame, yourSplittingColumn, theDelimiters).

Here's an example on a vector. You'd skip the data.table() wrapper if you already had a data.frame or a data.table.

library(splitstackshape)
cSplit(data.table(x), "x", ":|-", fixed = FALSE)
#    x_1 x_2 x_3
# 1:   I 500 600
# 2:   I 700 900
# 3:  II 200 250

By default, it also runs type.convert:

str(.Last.value)
# Classes ‘data.table’ and 'data.frame':    3 obs. of  3 variables:
#  $ x_1: Factor w/ 2 levels "I","II": 1 1 2
#  $ x_2: int  500 700 200
#  $ x_3: int  600 900 250
#  - attr(*, ".internal.selfref")=<externalptr>
