
Split string using space and capital letter

I'm trying to split my string into multiple rows. The string looks like this:

x <- c("C 10.1 C 12.4","C 12", "C 45.5 C 10")  

Code snippet:

strsplit(x, "//s")[[3]]

Result:

"C 45.5 C 10"

Expected output: split the string into multiple rows like this:

"C 10.1"
"C 12.4"
"C 12"
"C 45.5"
"C 10" 

The question is: how do I split the string?

Clue: each split point is a space followed by a character, which is "C" in our case. Does anyone know how to do it?

You may use

unlist(strsplit(x, "(?<=\\d)\\s+(?=C)", perl=TRUE))

Output:

[1] "C 10.1" "C 12.4" "C 12"   "C 45.5" "C 10" 

See the online R demo and a regex demo.

The (?<=\\d)\\s+(?=C) regex matches 1 or more whitespace characters ( \\s+ ) that are immediately preceded by a digit ( (?<=\\d) ) and immediately followed by C . Because the lookbehind and lookahead do not consume any characters, only the whitespace is removed by the split.
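To see why the lookarounds matter, here is a minimal sketch that compares a plain whitespace split with the lookaround version, using the x from the question. The variable names plain and guarded are only illustrative.

```r
x <- c("C 10.1 C 12.4", "C 12", "C 45.5 C 10")

# Splitting on any whitespace also separates each "C" from its number.
plain <- unlist(strsplit(x, "\\s+"))

# The lookbehind (?<=\\d) and lookahead (?=C) only allow a split where a
# digit precedes the whitespace and "C" follows it. Neither the digit nor
# the "C" is consumed, so both stay in the resulting pieces.
guarded <- unlist(strsplit(x, "(?<=\\d)\\s+(?=C)", perl = TRUE))

print(plain)    # ten pieces: every "C" and every number on its own
print(guarded)  # five pieces: "C 10.1" "C 12.4" "C 12" "C 45.5" "C 10"
```

Note that perl = TRUE is needed here, since lookarounds are not supported by R's default regex engine.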

If C can be any uppercase ASCII letter, replace C with [A-Z] .
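As a rough illustration of that variant, the sketch below runs the same split with an uppercase-letter lookahead. The vector y is a hypothetical input made up for this example; it is not from the question.

```r
# Hypothetical input: same layout as x, but with varying leading letters.
y <- c("A 10.1 B 12.4", "C 12", "D 45.5 E 10")

# Split on whitespace that follows a digit and precedes any uppercase
# ASCII letter.
parts <- unlist(strsplit(y, "(?<=\\d)\\s+(?=[A-Z])", perl = TRUE))

print(parts)
# [1] "A 10.1" "B 12.4" "C 12"   "D 45.5" "E 10"
```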

A somewhat more complicated expression, but easier on the regex side:

unlist(
  sapply(
    strsplit(x, " ?C"),
    function(x) {
      paste0("C", x[nzchar(x)])
    }
  )
)
"C 10.1" "C 12.4" "C 12"   "C 45.5" "C 10"  
