Extract the characters between spaces in the middle of a string in R

I have several strings like

"AAA BBB CCC 1X2L BOT BR, DDD EEE FFF 3X4L BOT BR, GGG 5X6L BOT BR"

I just want to extract the word before the last two spaces in each comma-separated part, i.e., I want

"1X2L, 3X4L, 5X6L"

only.

How can I do this in R?

You can try using sub after splitting the string on commas ( , ).

x <- "AAA BBB CCC 1X2L BOT BR, DDD EEE FFF 3X4L BOT BR, GGG 5X6L BOT BR"
sub('.*?(\\w+)\\s\\w+\\s\\w+$', '\\1', strsplit(x, ',\\s')[[1]])
#[1] "1X2L" "3X4L" "5X6L"

.*? - matches as few characters as possible until

(\\w+) - a capture group that captures the word we want, followed by

\\s - a whitespace character, followed by

\\w+ - a word, followed by

\\s - another whitespace character and a final word ( \\w+ ) at the end of the string ( $ ).
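
Since the question mentions several strings and asks for a single comma-separated result, the same approach could be wrapped in a small helper. This is only a sketch: extract_codes is a hypothetical name, the second input string is made up for illustration, and it assumes every comma-separated part ends with exactly two trailing words after the wanted one.

```r
# Hypothetical helper (not part of the original answer): for each input string,
# split on commas, keep the third-from-last word of each part, and join the
# results back with ", ".
extract_codes <- function(strings) {
  vapply(strsplit(strings, ",\\s*"), function(parts) {
    codes <- sub(".*?(\\w+)\\s\\w+\\s\\w+$", "\\1", parts)
    paste(codes, collapse = ", ")
  }, character(1))
}

x <- "AAA BBB CCC 1X2L BOT BR, DDD EEE FFF 3X4L BOT BR, GGG 5X6L BOT BR"
y <- "HHH 7X8L BOT BR"  # made-up second string, for illustration only

result <- extract_codes(c(x, y))
print(result)
```

vapply with character(1) is used here so the helper fails loudly if any input does not produce exactly one string.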

Another regex you can use in this case

library(stringr)
# str_extract_all returns a list; [[1]] takes the matches for the single string x
str_extract_all(x, "\\d{1}\\w{1}\\d{1}\\w{1}")[[1]]
#[1] "1X2L" "3X4L" "5X6L"
  • \\d{1} : matches exactly one digit
  • \\w{1} : matches exactly one word character (a letter, digit, or underscore)
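
Because \\w also accepts digits and underscores, a stricter variant may be safer if other tokens could contain digits. The sketch below uses base R only and assumes the wanted codes are always digit, letter, digit, letter as a whole word; that format is inferred from the sample string, not stated in the question.

```r
x <- "AAA BBB CCC 1X2L BOT BR, DDD EEE FFF 3X4L BOT BR, GGG 5X6L BOT BR"

# Assumed code format: digit, letter, digit, letter, delimited by word boundaries.
pattern <- "\\b\\d[A-Za-z]\\d[A-Za-z]\\b"

# gregexpr finds every match position; regmatches pulls out the matched text.
codes <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
print(codes)

# Collapse into the single string asked for in the question.
joined <- paste(codes, collapse = ", ")
print(joined)
```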
