R Regex / gsub: How to collapse spaces in a string
I have a vector of sentences that were scanned from handwritten documents. In the process there were some spacing problems like this:
The d og is br own.
I was curious if there is a way to generically take any pattern of the form '_x_' (space-character-space) and collapse the second space, like this:
The d og is br own. --> The dog is br own.
I'm only worried about a single character between the spaces ('_x_', NOT '_xx_').
Any suggestions?
Maybe
> x<-"The d og is br own."
> gsub(" (.) "," \\1",x)
[1] "The dog is br own."
or
gsub(" ([[:alnum:]]) "," \\1",x)
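(.) matches any single character; ([[:alnum:]]) matches a single alphanumeric character only. In both patterns the replacement " \\1" keeps the leading space and the captured character and drops the trailing space.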
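To make the difference between the two patterns concrete, here is a small sketch applied to a vector rather than a single string. The second element, "Wait - what", is a made-up example that is not from the original question; it was chosen only because it contains a lone non-alphanumeric character between spaces.

```r
# Sample input: the first sentence is from the question,
# the second is a hypothetical one added for illustration.
x <- c("The d og is br own.", "Wait - what")

# (.) captures any single character, so the lone "-" is also
# glued to the word that follows it.
any_char <- gsub(" (.) ", " \\1", x)

# ([[:alnum:]]) captures only a single letter or digit, so the
# lone "-" is left untouched.
alnum_only <- gsub(" ([[:alnum:]]) ", " \\1", x)

print(any_char)
print(alnum_only)
```

Since gsub is vectorised over its input, no loop is needed for the vector of sentences. Which pattern is preferable depends on whether isolated punctuation in the scanned text should be joined to the following word; that choice is left to the reader.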