
as.numeric is rounding off values

I am trying to convert a character column in a data frame to numeric. However, the values I get back appear rounded.

Nothing I have tried from other questions of the same nature on SO has worked for me. I have checked the class of the column vector I am trying to convert, and it is character, not factor.

Here is my code snippet:

# Read the first 100 rows; colC and colD stay character so they can be cleaned first.
data1 <- read.csv("file.csv", nrows = 100, colClasses = c("factor", "factor", "character", "character"))
# Strip every character that is not a digit or a decimal point.
y <- Vectorize(function(x) gsub("[^\\.\\d]", "", x, perl = TRUE))
data1$colC <- y(data1$colC)
data1$colD <- y(data1$colD)

data1$colC <- as.numeric(data1$colC)
data1$colD <- as.numeric(data1$colD)

Edit:

> dput(head(data1))
structure(list(colA = structure(c(2L, 2L, 5L, 6L, 5L, 6L), .Label = c("(Other)",
"Direct", "Display", "Email", "Organic Search", "Paid Search", 
"Referral", "Social Network"), class = "factor"), colB = structure(c(1L, 
2L, 2L, 2L, 1L, 1L), .Label = c("No", "Yes"), class = "factor"), 
colC = c("4023107.87", "3180863.42", "2558777.81", "2393736.25", 
"1333148.48", "1275627.13"), colD = c("49731596.35", "33604210.26", 
"20807573.12", "20061467.30", "10488358.77", "10442249.09"
)), .Names = c("colA", "colB", "colC", "colD"), row.names = c(NA, 
6L), class = "data.frame")

I think this is a representation problem, not an actual rounding problem ...

options("digits") ## 7

From ?options:

'digits': controls the number of digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7. See the note in 'print.default' about values greater than 15.

digits can be reset either on a one-off basis, i.e. print(object, digits=...), or globally, i.e. options(digits=20). (20 is probably overkill, but it helps you see what's happening; based on the results below, 10 might serve your needs well.)

as.numeric(data1$colC)
[1] 4023108 3180863 2558778 2393736 1333148 1275627
print(as.numeric(data1$colC),digits=10)
[1] 4023107.87 3180863.42 2558777.81 2393736.25 1333148.48 1275627.13
print(as.numeric(data1$colC),digits=20)
[1] 4023107.8700000001118 3180863.4199999999255 2558777.8100000000559
[4] 2393736.2500000000000 1333148.4799999999814 1275627.1299999998882
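
To confirm that only the printed form is shortened, it may help to check the stored values directly and to format them for display without changing the global option. The following is a minimal sketch, not part of the original answer: the vector x holds three values copied from the dput() output in the question, and the names x, frac and shown are made up for illustration.

```r
# Three sample values copied from the dput() output in the question.
x <- as.numeric(c("4023107.87", "3180863.42", "2558777.81"))

# The fractional part is still stored, even though the default print hides it.
frac <- x - floor(x)
print(frac)

# Format for display with a fixed number of decimals;
# this leaves options("digits") untouched.
shown <- sprintf("%.2f", x)
print(shown)

# Alternatively, raise the digits setting temporarily and restore it afterwards.
old <- options(digits = 10)
print(x)
options(old)
```

Note that sprintf() returns character strings, so it is best kept for output only; the numeric column itself needs no change.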
