Split comma-separated column entry into rows
I have already found other versions of the same question, but I was not able to adapt the answers given there to my problem. Here is an older link:
The OP there had data consisting of only two columns, and the given answer handles that case really nicely. But what about more than two columns? Is there a way to adapt the linked code snippet?
Here is an example:
ve <- rbind("4,2","3","1,2,3","5","6","7")
expl <- cbind(head(mtcars),ve)
row.names mpg cyl disp hp drat wt qsec vs am gear carb ve
1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4,2
2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3
3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1,2,3
4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5
5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6
6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 7
I would need:
row.names mpg cyl disp hp drat wt qsec vs am gear carb ve
1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4
2 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 2
3 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3
4 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1
5 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 2
6 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 3
7 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5
8 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6
9 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 7
Thank you!
Try unnest from the tidyr package. My example uses dplyr, but you can also accomplish this with base functions.
library(dplyr)
library(tidyr)
expl %>%
  mutate(ve = strsplit(as.character(ve), ",")) %>%
  unnest(ve)
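The question's desired output keeps the car names, which live in the row names here. A minimal sketch of the same unnest approach that first moves the row names into a regular column so they are carried through the split; the column name `model` is a hypothetical choice, and the example data is rebuilt so the block runs on its own:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Rebuild the example data from the question
expl <- head(mtcars)
expl$ve <- c("4,2", "3", "1,2,3", "5", "6", "7")

res <- expl %>%
  rownames_to_column("model") %>%                    # "model" is an illustrative name
  mutate(ve = strsplit(as.character(ve), ",")) %>%   # ve becomes a list column
  unnest(ve)                                         # one row per list element

res
```

Every other column is repeated once per split value, so this works for any number of columns.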
Here's an attempt using base R only (which also preserves the row names, in a way at least...)
# Split each entry of ve into its comma-separated parts
ve <- strsplit(as.character(expl$ve), ",")
# Repeat each row once per part, then overwrite ve with the flattened values
Res <- expl[rep(seq_len(nrow(expl)), lengths(ve)), ]
Res$ve <- unlist(ve)
Res
# mpg cyl disp hp drat wt qsec vs am gear carb ve
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4
# Mazda RX4.1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 2
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1
# Datsun 710.1 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 2
# Datsun 710.2 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 3
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 7
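The base R idea above does not depend on how many other columns the data frame has, so it can be wrapped in a small reusable function. This is only a sketch: `split_rows` and its arguments are hypothetical names, not part of any package.

```r
# Hypothetical helper: repeat each row once per delimited value in `col`.
split_rows <- function(df, col, sep = ",") {
  # Split every entry of the chosen column on the literal separator
  parts <- strsplit(as.character(df[[col]]), sep, fixed = TRUE)
  # Repeat row i as many times as entry i has parts
  out <- df[rep(seq_len(nrow(df)), lengths(parts)), , drop = FALSE]
  # Replace the column with the flattened individual values
  out[[col]] <- unlist(parts)
  out
}

# Rebuild the example data from the question
expl <- head(mtcars)
expl$ve <- c("4,2", "3", "1,2,3", "5", "6", "7")

res <- split_rows(expl, "ve")
res
```

As in the output above, duplicated rows get suffixed row names such as "Mazda RX4.1", and the split values stay character; wrap them in as.numeric() if numbers are needed.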
Or using data.table, one option is
library(data.table)
# Group by every column except the last (ve) and split ve within each group
setDT(expl)[,
  strsplit(as.character(ve), ","),
  by = c(names(expl)[-length(expl)])
]
Another option would be
setkey(expl, ve)[setDT(expl)[, strsplit(as.character(ve), ","), ve]]
I would recommend cSplit from my "splitstackshape" package.
Since your example has rownames, I've converted your example data to a data.table with the keep.rownames = TRUE argument.
library(splitstackshape)
cSplit(as.data.table(expl, keep.rownames = TRUE), "ve", ",", "long")
# rn mpg cyl disp hp drat wt qsec vs am gear carb ve
# 1: Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 4
# 2: Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 2
# 3: Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3
# 4: Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 1
# 5: Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 2
# 6: Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 3
# 7: Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5
# 8: Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6
# 9: Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 7