
Convert factor that includes “.” to numeric

I'm using a dataset that has periods (.) in place of NAs. Right now, the column I'm looking at is a factor with levels 1, 2, and ".". I'm trying to take a mean, and obviously, na.rm isn't working. I went back and cleaned the data by changing the periods to NAs (pe94[pe94 == "."] <- NA), and that appeared to work. However, mean can't take the mean of a factor, and when I convert the factor to a numeric, the NAs become 3s. How can I get rid of this problem?

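I also had similar issues (and other issues) converting factors into numbers for mathematical analysis. However, I found a fairly simple solution that seems to work: convert the factor to character first, and only then to numeric. Calling as.numeric directly on a factor returns the internal level codes rather than the values you see printed. Hope this helps ...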

# Script to convert factor data to numeric data without loss or alteration of values

# Sample data frame with factor variables whose labels are numbers
factor.vector1 <- factor(x = c(111, 222, 333, 444, 555))
thousands <- c("1,000", "2,000", "3,000", "4,000", "5,000")
factor.vector2 <- factor(x = thousands)
df <- data.frame(factor.vector1, factor.vector2)

# Numbers as factors WITHOUT thousands separators
# 1st: convert the column to character, so the labels (not the level codes) are kept
df[, 1] <- as.character(df[, 1])
# 2nd: convert the character column to numeric
df[, 1] <- as.numeric(df[, 1])

# Numbers as factors WITH thousands separators
# If the values contain commas (e.g. 2,000), use gsub to remove them first.
# If commas are not removed before conversion, those values become NA.
# 1st: convert the column to character and strip the commas
df[, 2] <- gsub(",", "", as.character(df[, 2]))
# 2nd: convert the character column to numeric
df[, 2] <- as.numeric(df[, 2])
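
Applied to the column from the question, the same two-step conversion might look like the sketch below. The vector x is made-up data standing in for one column of pe94 (the real data isn't shown), so treat the names and values as assumptions.

```r
# Hypothetical stand-in for one pe94 column: a factor with levels ".", "1" and "2"
x <- factor(c("1", "2", ".", "2", "1"))

# as.numeric() on the factor itself returns the internal level codes,
# not the labels, which is why the converted values look wrong
codes <- as.numeric(x)

# Convert to character first, mark the "." placeholder as missing,
# then convert the remaining labels to numeric
x_chr <- as.character(x)
x_chr[x_chr == "."] <- NA
x_num <- as.numeric(x_chr)

# The mean now ignores the missing entries
mean(x_num, na.rm = TRUE)  # 1.5
```

Replacing "." with NA before calling as.numeric avoids the "NAs introduced by coercion" warning; skipping that step gives the same numbers, just with the warning.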
