
Change NAs in multiple columns of a data frame

I have a data frame (called hp) that contains several columns with NAs. These columns are all factors. I want to convert each one to character, replace the NAs with "none", and convert it back to factor. I have 14 columns, so I'd like to do this with a loop, but it doesn't work.

Thanks for your help.

The columns:

miss_names <- c("Alley", "MasVnrType", "FireplaceQu", "PoolQC", "Fence",
                "MiscFeature", "GarageFinish", "GarageQual", "GarageCond",
                "BsmtQual", "BsmtCond", "BsmtExposure", "BsmtFinType1",
                "BsmtFinType2", "Electrical")

The loop:

for (i in miss_names){       
    hp[i]<-as.character(hp[i])
    hp[i][is.na(hp[i])]<-"NONE"
    hp[i]<-as.factor(hp[i])
    print(hp[i])
    }

 Error in sort.list(y) : 'x' must be atomic for 'sort.list'
 Have you called 'sort' on a list? 

Use addNA() to add NA as a factor level, then rename that level to whatever you want. You don't have to convert the factors to character vectors first. You can loop over all the factor columns in the data frame and replace them one by one.

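# Sample data
# stringsAsFactors = TRUE is needed in R >= 4.0, where data.frame()
# no longer converts strings to factors by default
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE
)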

# Loop over the columns
for (i in seq_along(dd)) {
  xx <- addNA(dd[, i])
  levels(xx) <- c(levels(dd[, i]), "none")
  dd[, i] <- xx
}

This gives us

> str(dd)
'data.frame':   20 obs. of  2 variables:
 $ x: Factor w/ 4 levels "a","b","c","none": 1 4 1 4 4 1 4 3 3 3 ...
 $ y: Factor w/ 4 levels "A","B","C","none": 1 1 2 2 1 3 3 3 4 1 ...
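
For comparison, the loop in the question fails because `hp[i]` returns a one-column data frame (a list) rather than the column itself, so `as.character()` and `as.factor()` are not operating on a vector. Extracting the column with `[[ ]]` makes the character round trip work. A minimal sketch, assuming a small made-up `hp` that mocks only two of the columns (the values are illustrative, not from the asker's data):

```r
# Hypothetical stand-in for the asker's `hp`; values are made up
hp <- data.frame(
  Alley   = factor(c("Grvl", NA, "Pave", NA)),
  Fence   = factor(c(NA, "MnPrv", NA, "GdWo")),
  LotArea = c(8450, 9600, 11250, 9550)
)
miss_names <- c("Alley", "Fence")

for (i in miss_names) {
  col <- as.character(hp[[i]])  # [[ ]] returns the vector, not a data frame
  col[is.na(col)] <- "NONE"     # replace missing values
  hp[[i]] <- as.factor(col)     # convert back to factor
}

str(hp)
```

Columns not listed in `miss_names` (here `LotArea`) are left untouched.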

An alternative solution using the purrr library, with the same data as @Johan Larsson:

library(purrr)

set.seed(15)
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE  # needed in R >= 4.0 so the columns are factors
)

# Create a function to convert NA to "none"
convert.to.none <- function(x) {
  y <- addNA(x)
  levels(y) <- c(levels(x), "none")
  y
}

# Use the map function to cycle through dd's columns
map_df(dd, convert.to.none)

This scales easily to any number of columns.
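
If only some columns need the treatment, as with `miss_names` in the question, the same kind of helper can be applied to a subset in base R with `lapply()`. The following is a sketch under assumptions: `na_to_level` is a hypothetical helper name (not from any package) and the toy `hp` is made up for illustration.

```r
# Hypothetical helper: turn NA into an explicit, named factor level
na_to_level <- function(x, label = "none") {
  x <- addNA(x, ifany = TRUE)           # add an NA level only if NAs are present
  levels(x)[is.na(levels(x))] <- label  # rename the NA level, if there is one
  x
}

# Toy data, made up for illustration
hp <- data.frame(
  Alley   = factor(c("Grvl", NA, "Pave")),
  Fence   = factor(c("MnPrv", "GdWo", "MnPrv")),  # no NAs in this column
  LotArea = c(8450, 9600, 11250)
)
miss_names <- c("Alley", "Fence")

# Apply the helper to the selected columns only
hp[miss_names] <- lapply(hp[miss_names], na_to_level)

str(hp)
```

Using `ifany = TRUE` means a column without any NAs keeps its original levels instead of gaining an unused "none" level.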
