Change all columns from factor to numeric in R

I am working with a big dataset that is causing some trouble because some of the columns in the dataset are being treated as factors. How can I convert all of the columns from factor to numeric without having to do it column by column?

I have tried a small loop, but it returns NA values. Here is some sample data that reproduces the case:

data <- structure(list(v1 = c(22.394, 43.72, 58.544, 56.877, 1.659, 29.142, 
67.836, 68.851), v2 = c(144.373, 72.3, 119.418, 112.429, 35.779, 
41.661, 166.941, 126.548), v3 = structure(c(33L, 29L, 33L, 5L, 
13L, 31L, 5L, 8L), .Label = c("", "#VALUE!", "0", "1", "10", 
"11", "12", "13", "14", "15", "16", "17", "18", "19", "2", "20", 
"21", "22", "23", "24", "25", "26", "28", "29", "3", "30", "32", 
"33", "4", "48", "5", "6", "7", "8", "9"), class = "factor"), 
    v4 = structure(c(24L, 6L, 22L, 23L, 16L, 22L, 23L, 26L), .Label = c("", 
    "-1", "-2", "-4", "#VALUE!", "0", "1", "10", "11", "12", 
    "13", "14", "15", "16", "17", "18", "19", "2", "24", "28", 
    "29", "3", "4", "5", "6", "7", "8", "9"), class = "factor")), .Names = c("v1", 
"v2", "v3", "v4"), row.names = c("4", "5", "6", "7", "8", "9", 
"10", "11"), class = "data.frame")

for (i in 1:ncol(data)){
data[,i] <- as.numeric(as.character(data[i]))
} ## returns NAs

Is there a command I can apply to turn all of these columns into the numeric class?

The fix below works, but I suspect your data contains an odd character or a space, something that makes the columns read in as factors. You can try reading the data in with the argument stringsAsFactors = FALSE, although the affected columns would then come in as character rather than numeric. The fix:

data[] <- lapply(data, function(x) as.numeric(as.character(x)))

## > str(data)
## 'data.frame':   8 obs. of  4 variables:
##  $ v1: num  22.39 43.72 58.54 56.88 1.66 ...
##  $ v2: num  144.4 72.3 119.4 112.4 35.8 ...
##  $ v3: num  7 4 7 10 18 5 10 13
##  $ v4: num  5 0 3 4 18 3 4 7
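
For anyone wondering why the as.character() step is needed and why the loop in the question returns NAs, here is a minimal sketch on made-up data (the small df below is an assumption for illustration, not the asker's data). as.numeric() on a factor returns the internal integer codes, not the labels, and data[i] is a one-column data frame rather than a vector, so as.character(data[i]) does not yield one string per row. Indexing with [[i]] should make the loop behave like the lapply() call above.

```r
# Hypothetical toy data: two factor columns holding numbers as labels
df <- data.frame(v3 = factor(c("10", "7", "33", "7")),
                 v4 = factor(c("5", "0", "-2", "5")))

# as.numeric() alone returns the level codes (positions in levels(df$v3)),
# which are not the original values
codes <- as.numeric(df$v3)

# Going through as.character() first recovers the labels; [[i]] extracts
# the column as a vector, whereas [i] would return a one-column data frame
for (i in seq_along(df)) {
  df[[i]] <- as.numeric(as.character(df[[i]]))
}

str(df)
```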

You may be trying to solve the wrong problem, or solving the problem in the wrong place. Often the reason that a column you think is numeric is read in as a factor is that the original data contain non-numeric characters where numbers should be. Converting these to numbers will produce a missing value instead of the intended number (which is better than a wrong number). It may be best to fix the original source of the data so that it is read in correctly.

The next option is to use the colClasses argument to read.table and related functions to specify that the columns should be numeric; the conversion then takes place automatically. This can even be used (with a couple more steps) to convert "numbers" that contain "$", "%", or "," somewhere in them.
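
One way those "couple more steps" could look is sketched below. It registers a conversion with setAs() so that colClasses can refer to it by name. The class name num.with.symbols and the inline CSV text are made up for illustration, and this is only one possible approach under those assumptions.

```r
library(methods)

# Hypothetical class name, used only as a label that colClasses can refer to
setClass("num.with.symbols")

# Conversion rule: strip "$", "," and "%" before calling as.numeric()
setAs("character", "num.with.symbols",
      function(from) as.numeric(gsub("[$,%]", "", from)))

# Made-up sample input; fields containing commas are quoted
csv_text <- 'id,price,share
1,"$1,200.50",12%
2,"$3,000",7.5%'

df <- read.csv(text = csv_text,
               colClasses = c("integer", "num.with.symbols", "num.with.symbols"))
str(df)
```

Note that this strips the percent sign without dividing by 100; whether 12% should become 12 or 0.12 depends on what the column means.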

If these don't work for you and you want to convert the existing data frame, here is one approach:

w <- which( sapply( mydf, class ) == 'factor' )
mydf[w] <- lapply( mydf[w], function(x) as.numeric(as.character(x)) )
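
Since any label that is not a valid number turns into NA during this conversion, it can help to list the offending labels first. A rough sketch, using a made-up mydf (the data and the name bad are assumptions for illustration):

```r
# Hypothetical data frame with two factor columns and one numeric column
mydf <- data.frame(a = factor(c("1", "2", "#VALUE!")),
                   b = factor(c("3.5", "", "4")),
                   c = c(1, 2, 3))

# Positions of the factor columns
w <- which(sapply(mydf, is.factor))

# For each factor column, keep the levels that as.numeric() cannot parse
bad <- lapply(mydf[w], function(x) {
  lv <- levels(x)
  lv[is.na(suppressWarnings(as.numeric(lv)))]
})

print(bad)
```

Here bad is a named list with one entry per factor column; an empty character vector means every label in that column would convert cleanly.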

I accomplish this by simply writing the data frame out and reading it back in, specifying that all columns are numeric. I use the data.table package, but the same applies to the base read/write functions.

library(data.table)
fwrite(dfm,"some.name.temp")
dfm <- fread("some.name.temp",colClasses="numeric")

#VALUE! seems to be the odd value; if so, telling R to treat it as missing by using the na.strings argument is probably the way to go.

read.table(..., na.strings = "#VALUE!")
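
A small self-contained sketch of this, using made-up inline CSV text in place of the real file. It uses read.csv because read.table treats "#" as a comment character by default, so with read.table you would probably also want comment.char = "".

```r
# Made-up sample input containing the spreadsheet error marker
csv_text <- "v1,v3
22.394,7
43.72,#VALUE!
58.544,4"

# Treat both the usual NA and "#VALUE!" as missing; read.csv does not
# interpret "#" as a comment, so the marker is read as a field value
df <- read.csv(text = csv_text, na.strings = c("NA", "#VALUE!"))
str(df)
```

With the marker declared as missing, v3 should come in as a numeric type with an NA in the second row instead of as a factor or character column.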
