How to split a string of numbers and letters of different lengths into different columns in R?

Question

I have a column called 'WFBS' with over a million rows of strings of different lengths that look like this:

WFBS <- c("M010203", "S01020304", "N104509")

and I need an output that looks like this:

WFBS1 <- c("M01", "S01", "N10")
WFBS2 <- c("02", "02", "45")
WFBS3 <- c("03", "03", "09")
WFBS4 <- c(NA, "04", NA)

So I need to split each string so that the first column holds 3 characters (i.e. the letter followed by 2 digits) and each remaining column holds 2 characters, until no characters are left.

I tried using the function strsplit, but it says that my variables are not characters, so I created a vector x as follows:

x <- as.character(WFBS)

but then I don't know how to separate the strings into columns with strsplit.

Answer 1

An option in base R: insert a delimiter between the pieces with sub, then read the result with read.csv to create a 4-column data.frame.

read.csv(text = sub("^(...)(..)(..)(.*)", "\\1,\\2,\\3,\\4", WFBS),
         header = FALSE, colClasses = rep("character", 4), na.strings = "",
         col.names = paste0("WFBS", 1:4), stringsAsFactors = FALSE)
#    WFBS1 WFBS2 WFBS3 WFBS4
#1   M01    02    03  <NA>
#2   S01    02    03    04
#3   N10    45    09  <NA>
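
The pattern above puts everything after the seventh character into WFBS4, so longer strings would not be split further. As a minimal sketch for an arbitrary number of two-character pieces, one could compute the start and end positions and use substring. The helper name split_wfbs is hypothetical and not part of the original answer, and the sketch assumes every string starts with a 3-character piece.

```r
WFBS <- c("M010203", "S01020304", "N104509")

# Hypothetical helper: first piece is 3 characters wide, every later
# piece is 2 characters wide, up to the length of the longest string.
split_wfbs <- function(x) {
  x <- as.character(x)
  n_max <- max(nchar(x), na.rm = TRUE)
  # Start positions: 1 for the first piece, then 4, 6, 8, ...
  starts <- if (n_max > 3) c(1, seq(4, n_max, by = 2)) else 1
  ends <- c(3, starts[-1] + 1)
  out <- matrix(NA_character_, nrow = length(x), ncol = length(starts))
  for (i in seq_along(starts)) {
    out[, i] <- substring(x, starts[i], ends[i])
  }
  # Positions past the end of a string come back as "", turn them into NA.
  out[!nzchar(out)] <- NA_character_
  colnames(out) <- paste0("WFBS", seq_along(starts))
  as.data.frame(out, stringsAsFactors = FALSE)
}

res <- split_wfbs(WFBS)
print(res)
```

Because substring is vectorised over the whole column, the loop runs once per output column rather than once per row, which should matter with over a million rows.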

Answer 2

This might be a useful starting point:

library(tidyr)
df <- data.frame(WFBS = c("M010203", "S01020304", "N104509"),
                 stringsAsFactors = FALSE)
# Integer values in sep are character positions to split at
df %>% separate(col = WFBS,
                into = c("WFBS1", "WFBS2", "WFBS3", "WFBS4"),
                sep = c(3, 5, 7))
#   WFBS1 WFBS2 WFBS3 WFBS4
# 1   M01    02    03      
# 2   S01    02    03    04
# 3   N10    45    09      

This leaves you with empty strings rather than NAs in the leftover positions, which you would have to convert.
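
One way to do that conversion, sketched in base R. The data frame out below is built by hand to stand in for the result of separate() shown above, so that the snippet runs without tidyr; the name out is an assumption for illustration.

```r
# Stand-in for the separate() result: empty strings where nothing was left.
out <- data.frame(WFBS1 = c("M01", "S01", "N10"),
                  WFBS2 = c("02", "02", "45"),
                  WFBS3 = c("03", "03", "09"),
                  WFBS4 = c("", "04", ""),
                  stringsAsFactors = FALSE)

# out == "" yields a logical matrix; assigning through it replaces
# every empty string in the data frame with NA.
out[out == ""] <- NA
print(out)
```

The same assignment can be applied directly to the object returned by separate().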
