Splitting a string column with unequal size into multiple columns using R
My data frame has a string column like this:
str <- c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78")
no <- 1:4
df <- data.frame(no, str)  # keep the data frame so it can be reused below
df
no str
1 1 M 12; M 13
2 2 M 24
3 3 <NA>
4 4 C 12; C 50; C 78
It has multiple values, separated by the ";" symbol. I need to split this into multiple columns (3 columns in this example) so that each column contains only one value of the string. Is this possible in R?
This is a good occasion to make use of the extra = "merge" argument of separate:
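library(tidyr)  # separate() is provided by tidyr, which also re-exports %>%
# df is the data frame from the question
df %>%
  separate(str, c("A", "B", "C"), sep = "; ", extra = "merge")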
no A B C
1 1 M 12 M 13 <NA>
2 2 M 24 <NA> <NA>
3 3 <NA> <NA> <NA>
4 4 C 12 C 50 C 78
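If the maximum number of values per row is not known in advance, the column names passed to separate can be built from the data. The following is only a sketch under that assumption: the names V1, V2, ... are hypothetical placeholders, and fill = "right" is added so that shorter rows are padded with NA without a warning.

```r
library(tidyr)

# Example data from the question
df <- data.frame(
  no  = 1:4,
  str = c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78"),
  stringsAsFactors = FALSE
)

# Count the pieces in each row; an NA row counts as a single piece
n_pieces <- lengths(strsplit(df$str, ";", fixed = TRUE))
n_cols   <- max(n_pieces)

# Hypothetical column names: V1, V2, ..., one per piece
new_names <- paste0("V", seq_len(n_cols))

# Split on ";" plus any following whitespace; pad short rows with NA on the right
result <- separate(df, str, into = new_names, sep = ";\\s*", fill = "right")
result
```

Computing n_cols first avoids hardcoding three column names, so the same call should keep working if a row with more values is added later.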
You can use str_split:
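library(tidyverse)
df <- data.frame(str = c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78"),
                 no = 1:4)
df %>%
  mutate(splits = str_split(str, "; ")) %>%
  # Note: recent tidyr releases may require a names_sep argument here,
  # e.g. unnest_wider(splits, names_sep = "_"), which changes the column names.
  unnest_wider(splits)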
which gives:
# A tibble: 4 x 5
str no ...1 ...2 ...3
<chr> <int> <chr> <chr> <chr>
1 M 12; M 13 1 M 12 M 13 <NA>
2 M 24 2 M 24 <NA> <NA>
3 <NA> 3 <NA> <NA> <NA>
4 C 12; C 50; C 78 4 C 12 C 50 C 78
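For comparison, the same split can be sketched in base R without any packages. This is not taken from the answers above; it is an alternative that assumes the same example data, and the column names V1, V2, ... are hypothetical.

```r
# Example data from the question
df <- data.frame(
  no  = 1:4,
  str = c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78"),
  stringsAsFactors = FALSE
)

# Split each string on ";" plus any following whitespace
pieces <- strsplit(df$str, ";\\s*")
n_cols <- max(lengths(pieces))

# Pad every row's pieces with NA so all rows have n_cols entries
padded <- lapply(pieces, function(p) {
  length(p) <- n_cols
  p
})

# Stack the padded vectors into a character matrix, one row per input row
mat <- do.call(rbind, padded)
colnames(mat) <- paste0("V", seq_len(n_cols))  # hypothetical names

# Combine with the id column, dropping the original string column
result <- cbind(df["no"], as.data.frame(mat, stringsAsFactors = FALSE))
result
```

Assigning to length(p) is what pads the shorter vectors with NA, which is why rbind can then build a rectangular matrix.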