Change NAs in multiple columns of a data frame
I have a data frame (called hp) that contains several columns with NAs. These columns are all of class factor. First I want to convert them to character, fill the NAs with "none", and then convert them back to factor. I have 14 columns, so I'd like to do this with a loop, but it doesn't work.
Thanks for your help.
The columns:
miss_names<-c("Alley","MasVnrType","FireplaceQu","PoolQC","Fence","MiscFeature","GarageFinish", "GarageQual","GarageCond","BsmtQual","BsmtCond","BsmtExposure","BsmtFinType1",
"BsmtFinType2","Electrical")
The loop:
for (i in miss_names){
hp[i]<-as.character(hp[i])
hp[i][is.na(hp[i])]<-"NONE"
hp[i]<-as.factor(hp[i])
print(hp[i])
}
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
Use addNA() to add NA as a factor level, then rename that level to whatever you want. You don't have to turn the factors into character vectors first. You can loop over all the factors in the data frame and replace them one by one.
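# Sample data
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE  # needed on R >= 4.0, where strings are no longer factors by default
)
# Loop over the columns
for (i in seq_along(dd)) {
  xx <- addNA(dd[, i])                       # add NA as an extra (last) level
  levels(xx) <- c(levels(dd[, i]), "none")   # rename that NA level to "none"
  dd[, i] <- xx
}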
This gives us
> str(dd)
'data.frame': 20 obs. of 2 variables:
$ x: Factor w/ 4 levels "a","b","c","none": 1 4 1 4 4 1 4 3 3 3 ...
$ y: Factor w/ 4 levels "A","B","C","none": 1 1 2 2 1 3 3 3 4 1 ...
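The loop in the question fails because hp[i] returns a one-column data frame (a list) rather than the factor itself, so as.factor() ends up sorting a list. Extracting the column with hp[[i]] avoids that. Below is a minimal sketch of the addNA() approach restricted to the columns named in miss_names; the small hp here is a hypothetical stand-in with made-up values, not the asker's data.

```r
# Hypothetical stand-in for hp, using two of the column names from the question
hp <- data.frame(
  Alley   = factor(c("Grvl", NA, "Pave", NA)),
  Fence   = factor(c(NA, "MnPrv", NA, "GdWo")),
  LotArea = c(8450, 9600, 11250, 9550)  # non-factor column, left untouched
)
miss_names <- c("Alley", "Fence")

for (nm in miss_names) {
  col <- addNA(hp[[nm]])                      # [[ ]] extracts the factor itself
  levels(col)[is.na(levels(col))] <- "NONE"   # rename the NA level
  hp[[nm]] <- col
}

str(hp)
```

Renaming the level via is.na(levels(col)) does not depend on where the NA level sits among the levels, so it should also be safe for columns whose levels were set up differently.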
An alternative solution with the purrr library, using the same data as @Johan Larsson:
library(purrr)  # map_df() also needs dplyr installed

set.seed(15)
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE  # needed on R >= 4.0, where strings are no longer factors by default
)

# Create a function to convert NA to "none"
convert.to.none <- function(x) {
  y <- addNA(x)
  levels(y) <- c(levels(x), "none")
  y
}

# Use the map function to cycle through dd's columns
map_df(dd, convert.to.none)
Wrapping the conversion in a function makes it easy to apply to any number of columns.