
Splitting a string column with unequal size into multiple columns using R

There is a string column like this in my data frame.

str <- c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78")
no <- 1:4
df <- data.frame(no, str)
df

  no              str
1  1       M 12; M 13
2  2             M 24
3  3             <NA>
4  4 C 12; C 50; C 78

It has multiple values separated by the ";" symbol. I need to split it into multiple columns (3 columns in this example) so that each column contains only one value of the string. Is this possible in R?

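This is a good occasion to use the extra = 'merge' argument of separate() (from tidyr). Splitting on "; " rather than ";" keeps leading spaces out of the new columns, and fill = 'right' pads shorter rows with NA without a warning:

library(dplyr)
library(tidyr)  # separate() is provided by tidyr, not dplyr
df %>%
  separate(str, c('A', 'B', 'C'), sep = "; ", extra = 'merge', fill = 'right')
  no    A    B    C
1  1 M 12 M 13 <NA>
2  2 M 24 <NA> <NA>
3  3 <NA> <NA> <NA>
4  4 C 12 C 50 C 78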
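
The call above hard-codes three target columns. When the maximum number of values per row is not known in advance, it could be computed from the data first. A minimal sketch, assuming the data frame from the question; the n_max variable and the col_ name prefix are illustrative choices, not part of the original answer:

```r
library(tidyr)
library(stringr)

# Example data from the question
df <- data.frame(no = 1:4,
                 str = c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78"))

# Values in the widest row = number of ";" separators + 1 (NA rows are ignored)
n_max <- max(str_count(df$str, ";"), na.rm = TRUE) + 1

out <- separate(df, str,
                into = paste0("col_", seq_len(n_max)),  # illustrative column names
                sep = "; ",
                fill = "right")  # pad shorter rows with NA on the right
out
```

Because the column names are generated from n_max, the same code should keep working if a later row contains four or more values.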

You can use str_split:

library(tidyverse)

df <- data.frame(str = c("M 12; M 13","M 24", NA, "C 12; C 50; C 78"),
                 no = seq(1:4))

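df %>%
  mutate(splits = str_split(str, "; ")) %>%
  # The list elements are unnamed; recent tidyr releases may require
  # names_sep here (e.g. names_sep = "_"), which changes the column names
  unnest_wider(splits)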

which gives:

# A tibble: 4 x 5
  str                 no ...1  ...2  ...3 
  <chr>            <int> <chr> <chr> <chr>
1 M 12; M 13           1 M 12  M 13  <NA> 
2 M 24                 2 M 24  <NA>  <NA> 
3 <NA>                 3 <NA>  <NA>  <NA> 
4 C 12; C 50; C 78     4 C 12  C 50  C 78 
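
For comparison, the same reshaping can be sketched in base R without any packages. This is an alternative to the answers above, not a restatement of them; it assumes the separator is always exactly "; ", and the str_ column prefix is an illustrative name:

```r
# Example data from the question
df <- data.frame(no = 1:4,
                 str = c("M 12; M 13", "M 24", NA, "C 12; C 50; C 78"),
                 stringsAsFactors = FALSE)

# Split each string on "; " (an NA input stays a single NA piece)
pieces <- strsplit(df$str, "; ", fixed = TRUE)

# The widest row decides how many columns are needed
n_cols <- max(lengths(pieces))

# Indexing past the end of a vector yields NA, which pads short rows
padded <- t(vapply(pieces, function(p) p[seq_len(n_cols)], character(n_cols)))
colnames(padded) <- paste0("str_", seq_len(n_cols))  # illustrative names

result <- cbind(df["no"], as.data.frame(padded, stringsAsFactors = FALSE))
result
```

The padding step relies on R returning NA for out-of-range indices, so no explicit fill logic is needed.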
