How to remove all numbers and commas from a string except any number immediately preceded by $ using R?
I would like to remove all numbers and commas from a string except any number that is immediately preceded by $ and immediately followed by a comma.
For example, I have:
str = "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"
I would like to obtain the following:
"$100-$1,000 $1001-$10,000 $10,001-$100,000"
I have tried to use gsub with a negative lookbehind:
new_str = gsub("(?<!\\$)[0-9]*,", "", str)
However, this gives the following error message:
Error in gsub("(?<!\\$)[0-9]*,", "", str) : invalid regular expression '(<!\$)[0-9]*,', reason 'Invalid regexp'
It seems that the negative lookbehind is incorrectly coded, but I can't seem to figure out why. Any help is much appreciated!
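A note on the error itself: by default gsub uses R's POSIX-style (TRE) regular expression engine, which has no lookbehind syntax, so the pattern is rejected before it is ever applied. Passing perl = TRUE switches to PCRE, which does support lookbehind. The sketch below shows that the original pattern then compiles but still strips the commas inside the dollar amounts, and it tries an adjusted pattern. The adjusted pattern is an assumption of this sketch, not something from the original post, and it has only been checked against the sample string.

```r
str <- "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"

# With perl = TRUE the original pattern compiles. It is still not what is
# wanted: [0-9]* may match an empty run, so a comma inside "$1,000" is
# matched starting at the comma itself, where the preceding character is
# a digit rather than "$".
attempt <- gsub("(?<!\\$)[0-9]*,", "", str, perl = TRUE)

# Assumed fix: require at least one digit, and refuse to start a match
# right after "$", a digit, or a comma. That leaves only the standalone
# "1, ", "2, ", "3, " labels eligible for removal.
new_str <- gsub("(?<![$\\d,])\\d+,\\s*", "", str, perl = TRUE)
print(new_str)
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
```

The lookbehind has to exclude digits and commas as well as $, otherwise the engine can simply start the match one character later, inside the dollar amount.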
1) This gives the desired answer in the case of the sample string:
gsub("\\d+, ", "", str)
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
2) Here is a second approach:
library(gsubfn)
paste(strapplyc(str, "(\\$\\S+)", simplify = c), collapse = " ")
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
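If installing gsubfn is not an option, the same extraction can probably be done in base R with gregexpr and regmatches. This is a minimal sketch, assuming (as approach 2 does) that every piece to keep starts with $ and contains no whitespace. The variable names m, tokens and result are illustrative only.

```r
str <- "1, $100-$1,000 2, $1001-$10,000 3, $10,001-$100,000"

# Locate every run that starts with "$" and continues up to the next whitespace
m <- gregexpr("\\$\\S+", str, perl = TRUE)

# regmatches() returns a list with one character vector per input string,
# so [[1]] picks out the matches for our single string
tokens <- regmatches(str, m)[[1]]

# Join the kept pieces back together with single spaces
result <- paste(tokens, collapse = " ")
print(result)
## [1] "$100-$1,000 $1001-$10,000 $10,001-$100,000"
```

Extracting what to keep, rather than deleting what to drop, avoids having to describe every unwanted token in the pattern.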