
R: removing the '$' symbols

I have downloaded some data from a web server, including prices that are formatted for human readers, with $ signs and thousands separators.

> head(m)
[1] $129,900 $139,900 $254,000 $260,000 $290,000 $295,000

I was able to get rid of the commas using

m <- sub(',','',m)

but

m <- sub('$','',m)

does not remove the dollar sign. If I try mn <- as.numeric(m) or as.integer, I get a warning:

Warning message: NAs introduced by coercion

and the result is:

> head(m)
[1] NA NA NA NA NA NA

How can I remove the $ sign? Thanks.

 dat <- gsub('[$]','',dat)
 dat <- as.numeric(gsub(',','',dat))
 > dat
 [1] 129900 139900 254000 260000 290000 295000

In one step:

 gsub('[$]([0-9]+)[,]([0-9]+)','\\1\\2',dat)
[1] "129900" "139900" "254000" "260000" "290000" "295000"
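One caveat worth sketching: the one-step pattern above expects exactly one comma after the dollar sign, so a price with two thousands separators would keep its second comma. The example below uses a made-up value ("$1,295,000", not in the question's data) to show this, and a character class that strips every $ and comma as a possible alternative.

```r
# Hypothetical input: the second value has two thousands separators
dat <- c("$129,900", "$1,295,000")

# The one-step pattern only consumes the first comma after the "$"
one_step <- gsub('[$]([0-9]+)[,]([0-9]+)', '\\1\\2', dat)
print(one_step)

# A character class removes every "$" and "," wherever they occur
cleaned <- as.numeric(gsub("[$,]", "", dat))
print(cleaned)
```

Inside a character class the $ is treated as a literal character, which is why "[$,]" needs no escaping.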

Try this. It replaces anything that is not a digit with the empty string:

as.numeric(gsub("\\D", "", dat))

or, to remove anything that is neither a digit nor a decimal point:

as.numeric(gsub("[^0-9.]", "", dat))
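The difference between the two patterns only shows up when a price has a fractional part. Here is a small sketch with a made-up value containing cents ("$1,299.50", an assumption, since the question's data has whole dollars only):

```r
# Hypothetical prices; the second one includes cents
prices <- c("$129,900", "$1,299.50")

# "\\D" drops every non-digit, including the decimal point,
# so the cents get glued onto the dollars
digits_only <- as.numeric(gsub("\\D", "", prices))
print(digits_only)

# "[^0-9.]" keeps digits and the decimal point
keep_decimal <- as.numeric(gsub("[^0-9.]", "", prices))
print(keep_decimal)
```

If the data can contain cents, the second pattern is probably the safer choice.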

UPDATE: Added a second, similar approach in case the data in the question is not representative.

You could also use:

x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")

library(qdap)
as.numeric(mgsub(c("$", ","), "", x))

yielding:

> as.numeric(mgsub(c("$", ","), "", x))
[1] 129900 139900 254000 260000 290000 295000

If you want to stay in base R, use the fixed = TRUE argument to gsub:

x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
as.numeric(gsub("$", "", gsub(",", "", x), fixed = TRUE))
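For context on why fixed = TRUE helps: in a regular expression, $ is an anchor that matches the end of the string rather than a literal dollar sign, so the original sub('$','',m) replaces an empty match and leaves the text unchanged. A minimal sketch of the options, using two of the values from the question (the variable names are only for illustration):

```r
x <- c("$129,900", "$139,900")

# As a regex, "$" matches the (empty) end of each string: nothing is removed
regex_anchor <- sub("$", "", x)
print(regex_anchor)

# fixed = TRUE treats the pattern as a literal string
literal_fixed <- sub("$", "", x, fixed = TRUE)
print(literal_fixed)

# Escaping the "$" also makes it literal
escaped <- sub("\\$", "", x)
print(escaped)
```

Any of the literal forms should work; fixed = TRUE avoids having to think about regex metacharacters at all.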
