
Prevent a numeric from being converted into a factor

I'm building a table from a CSV file. When the file is initially loaded, I need to read every column as character.

datset <- read.csv("outcome-of-care-measures.csv", colClasses = "character")

I have a function to convert a factor containing numbers (taken from another Stack Overflow question):

as.numeric.factor <- function(x) {as.numeric(levels(x))[x]}

I clean up the file with:

i<-17
datset[datset=="Not Available"]<-NA
datset<-datset[complete.cases(datset[,i]),]
x<- as.numeric.factor(datset[, i])

The datset table contains lots of columns I don't need, so I build a new table:

dat <- data.frame(cbind("HospitalName"= datset[,2], "State"= datset[,7],"Rating" = x))                        

My problem is that even though x is numeric, it gets turned into a factor when loaded into the data frame. I can verify this from debug mode with:

class(x)
[1] "numeric"

class(dat[,3])
[1] "factor"

In later code I'm trying to sort the Rating column, but it's failing, I guess because the column is a factor.

I've even tried adding stringsAsFactors = FALSE to read.csv, but this has no effect.

How can I prevent x from being converted into a factor when loading it into a data frame?

As Henrik explained in his comment, this:

dat <- data.frame(cbind("HospitalName"= datset[,2], "State"= datset[,7],"Rating" = x))

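is a poor way to construct a data frame. cbind converts everything to a matrix, which can only hold a single data type, so the numeric x is coerced to character alongside the two text columns. data.frame then turns those strings into factors, which was its default behaviour before R 4.0.0. Hence the coercion.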
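The coercion can be reproduced on a small scale. The sketch below uses made-up vectors in place of the columns from the question, and passes stringsAsFactors = TRUE explicitly to mimic the pre-4.0.0 default:

```r
# Toy stand-ins for the columns in the question (hypothetical values)
hospital <- c("A Hospital", "B Hospital")
state    <- c("TX", "NY")
x        <- c(14.2, 9.8)

# cbind() on plain vectors builds a matrix, which holds one type only
m <- cbind(HospitalName = hospital, State = state, Rating = x)
typeof(m)          # "character": the numbers have become strings

# stringsAsFactors = TRUE mimics the default before R 4.0.0
dat <- data.frame(m, stringsAsFactors = TRUE)
class(dat$Rating)  # "factor"
```

Passing stringsAsFactors = FALSE here would leave Rating as character, which is still not numeric, so the real fix is to avoid the intermediate matrix.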

It would be better to do:

dat <- data.frame(HospitalName = datset[,2], state = datset[,7], rating = x)
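As a minimal sketch of this approach, the example below builds a small hypothetical datset whose columns are all character, as they would be after reading with colClasses = "character". Because the rating column is character rather than a factor here, the sketch converts it with plain as.numeric() instead of the factor helper from the question; that substitution is an assumption about the data, not part of the original answer.

```r
# Hypothetical stand-in for datset: every column read as character
datset <- data.frame(
  HospitalName = c("A Hospital", "B Hospital", "C Hospital"),
  State        = c("TX", "NY", "TX"),
  Rating       = c("14.2", "9.8", "11.5"),
  stringsAsFactors = FALSE
)

# Character to numeric; a factor would need as.numeric(levels(f))[f]
x <- as.numeric(datset$Rating)

# Passing the vectors straight to data.frame() keeps each column's type
dat <- data.frame(
  HospitalName = datset$HospitalName,
  State        = datset$State,
  Rating       = x,
  stringsAsFactors = FALSE  # keeps text columns as character in R < 4.0.0
)

class(dat$Rating)           # "numeric"
dat[order(dat$Rating), ]    # rows sorted by Rating, ascending
```

With Rating stored as a numeric column, order() sorts by value rather than by factor level.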

However, as Roland mentioned, you should also be able to specify that this one column is numeric when reading the data in, via:

colclasses <- rep("character", 40)
colclasses[17] <- "numeric"  # column 17 holds the rating (i <- 17 in the question)

and then passing that to read.csv through its colClasses argument.
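A self-contained sketch of that idea follows. It writes a tiny made-up CSV to a temporary file so it can run anywhere; the three-column layout and the variable names are illustrative only. It also assumes that missing values are spelled "Not Available", as in the question, and declares them through na.strings so the numeric column can be parsed.

```r
# Write a tiny CSV to a temporary file (made-up data, three columns)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "HospitalName,State,Rating",
  "A Hospital,TX,14.2",
  "B Hospital,NY,Not Available",
  "C Hospital,TX,11.5"
), csv_path)

# Everything is character except the rating column
col_classes <- rep("character", 3)
col_classes[3] <- "numeric"

dat <- read.csv(csv_path,
                colClasses = col_classes,
                na.strings = "Not Available")  # assumed missing-value marker

# Drop rows without a rating, as the question does with complete.cases()
dat <- dat[!is.na(dat$Rating), ]
class(dat$Rating)  # "numeric"
```

Reading the column as numeric up front removes the need for a separate conversion step after loading.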


 