
Remove extra white space from between letters in R using gsub()

There are a slew of answers out there on how to remove extra whitespace from between words, which is super simple. However, I'm finding that removing extra whitespace within words is much harder. As a reproducible example, let's say I have a vector of data that looks like this:

x <- c("L L C", "P O BOX 123456", "NEW YORK")

What I'd like to do is something like this:

y <- gsub("(\\w)(\\s)(\\w)(\\s)", "\\1\\3", x)

But that leaves me with this:

[1] "LLC" "POBOX 123456" "NEW YORK"

Almost perfect, but I'd really like to have that second value say "PO BOX 123456". Is there a better way to do this than what I'm doing?

You may try this:

> x <- c("L L C", "P O BOX 123456", "NEW YORK")
> gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x, perl = TRUE)
[1] "LLC"           "PO BOX 123456" "NEW YORK" 

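It removes only a space that sits between two single-character words. The lookbehind `(?<=\\b\\w)` and lookahead `(?=\\w\\b)` check the neighbouring letters without consuming them, so the same letter can sit next to two removed spaces (as in "L L C"), while the space before "BOX" is left alone.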
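
To make the lookaround idea easier to experiment with, here is a minimal sketch that wraps the pattern in a function. The name `collapse_single_letters` and the `letters_only` argument are hypothetical and not part of the original answer. The sketch also uses `\\s+` instead of `\\s`, on the assumption that runs of several spaces between single letters should be removed as well. Note that `\\w` also matches digits and underscores, so the optional letters-only variant may be closer to what is wanted for address data.

```r
# Hypothetical helper (not from the original answer): removes whitespace
# that sits between two single-character words.
collapse_single_letters <- function(x, letters_only = FALSE) {
  # Character class for a "single character word":
  #   \\w      letters, digits and underscore (as in the answer above)
  #   [A-Za-z] ASCII letters only (an assumption for stricter matching)
  ch <- if (letters_only) "[A-Za-z]" else "\\w"

  # (?<=\\b<ch>)  lookbehind: previous character is a one-character word
  # \\s+          the whitespace to delete (one or more characters)
  # (?=<ch>\\b)   lookahead: next character is a one-character word
  pattern <- paste0("(?<=\\b", ch, ")\\s+(?=", ch, "\\b)")

  # perl = TRUE is required because lookarounds are PCRE syntax
  gsub(pattern, "", x, perl = TRUE)
}

x <- c("L L C", "P O BOX 123456", "NEW YORK")
collapse_single_letters(x)
# [1] "LLC"           "PO BOX 123456" "NEW YORK"

# Digits count as word characters unless letters_only = TRUE
collapse_single_letters("A B 1 2")                      # "AB12"
collapse_single_letters("A B 1 2", letters_only = TRUE) # "AB 1 2"
```

Because the lookarounds are zero-width, every space is tested against the original neighbours, which is why this avoids the "POBOX" result produced by the consuming pattern in the question.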
