R: removing the '$' symbols
I have downloaded some data from a web server, including prices that are formatted for humans, with $ signs and thousands separators.
> head(m)
[1] $129,900 $139,900 $254,000 $260,000 $290,000 $295,000
I was able to get rid of the commas using
m <- sub(',','',m)
but
m <- sub('$','',m)
does not remove the dollar sign. If I try
mn <- as.numeric(m)
or as.integer, I get a warning message:
Warning message: NAs introduced by coercion
and the result is:
> head(m)
[1] NA NA NA NA NA NA
How can I remove the $ sign? Thanks
dat <- gsub('[$]','',dat)
dat <- as.numeric(gsub(',','',dat))
> dat
[1] 129900 139900 254000 260000 290000 295000
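As a brief aside on why the original call had no effect: in a regular expression, an unescaped `$` is the end-of-string anchor, so `sub('$','',m)` matches an empty position rather than the character. The sketch below uses a made-up two-element vector `x` (not the asker's data) to compare three ways of matching a literal dollar sign.

```r
x <- c("$129,900", "$139,900")   # illustrative sample, not the original data

# Unescaped '$' anchors to the end of the string, so nothing visible changes.
unchanged <- sub("$", "", x)

# Three ways to match the literal character instead:
a  <- sub("[$]", "", x)              # inside a character class
b  <- sub("\\$", "", x)              # escaped with a backslash
c2 <- sub("$", "", x, fixed = TRUE)  # pattern treated as plain text

print(unchanged)  # [1] "$129,900" "$139,900"
print(a)          # [1] "129,900" "139,900"
```

All three variants give the same result here; which one reads best is mostly a matter of taste.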
In one step:
gsub('[$]([0-9]+)[,]([0-9]+)','\\1\\2',dat)
[1] "129900" "139900" "254000" "260000" "290000" "295000"
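One caveat worth checking against your own data: both `sub(',','',m)` and the one-step pattern above assume a single thousands separator per value. The sketch below uses a hypothetical value `big` with two commas (nothing like it appears in the question) to show what happens, along with a character-class variant that removes every `$` and `,`.

```r
big <- "$1,234,567"   # hypothetical value with two separators

# sub() replaces only the first match, so one comma survives.
first_only <- sub(",", "", big)

# The one-step pattern consumes "$1,234" and leaves the rest untouched.
one_step <- gsub("[$]([0-9]+)[,]([0-9]+)", "\\1\\2", big)

# Removing every '$' and ',' with a character class handles any number of commas.
cleaned <- gsub("[$,]", "", big)

print(first_only)  # [1] "$1234,567"
print(one_step)    # [1] "1234,567"
print(cleaned)     # [1] "1234567"
```

For the six prices shown in the question this makes no difference, since each has exactly one comma.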
Try this. It replaces anything that is not a digit with the empty string:
as.numeric(gsub("\\D", "", dat))
or, to remove anything that is neither a digit nor a decimal point:
as.numeric(gsub("[^0-9.]", "", dat))
UPDATE: Added a second, similar approach in case the data in the question is not representative.
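To illustrate when the two patterns differ, here is a small sketch with made-up prices that include cents (the question's data has none, so `prices` is purely an assumption for demonstration). `\\D` strips the decimal point along with everything else, which silently scales the value by 100.

```r
prices <- c("$1,299.50", "$20.00")   # hypothetical prices with cents

# "\\D" removes every non-digit, including the decimal point.
digits_only <- as.numeric(gsub("\\D", "", prices))

# "[^0-9.]" keeps digits and the decimal point.
with_decimal <- as.numeric(gsub("[^0-9.]", "", prices))

print(digits_only)   # [1] 129950   2000
print(with_decimal)  # [1] 1299.5   20.0
```

For whole-dollar amounts like those in the question, both patterns return the same numbers.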
You could also use:
x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
library(qdap)
as.numeric(mgsub(c("$", ","), "", x))
yielding:
> as.numeric(mgsub(c("$", ","), "", x))
[1] 129900 139900 254000 260000 290000 295000
If you want to stay in base R, use the fixed = TRUE argument to gsub:
x <- c("$129,900", "$139,900", "$254,000", "$260,000", "$290,000", "$295,000")
as.numeric(gsub("$", "", gsub(",", "", x), fixed = TRUE))