Replacing NAs in multiple columns of a data frame

Question

I have a data frame (called hp) that contains several columns with NAs. These columns are factors. I want to convert them to character, fill the NAs with "none", and convert them back to factor. There are 14 columns, so I'd like to do this in a loop, but it doesn't work.

Thanks for your help.

The columns:

miss_names <- c("Alley", "MasVnrType", "FireplaceQu", "PoolQC", "Fence",
                "MiscFeature", "GarageFinish", "GarageQual", "GarageCond",
                "BsmtQual", "BsmtCond", "BsmtExposure", "BsmtFinType1",
                "BsmtFinType2", "Electrical")

The loop:

for (i in miss_names){       
    hp[i]<-as.character(hp[i])
    hp[i][is.na(hp[i])]<-"NONE"
    hp[i]<-as.factor(hp[i])
    print(hp[i])
    }

 Error in sort.list(y) : 'x' must be atomic for 'sort.list'
 Have you called 'sort' on a list?

Answer 1

Use addNA() to add NA as a factor level, then rename that level to whatever you want. You don't have to convert the factors to character vectors first. You can loop over all the factor columns in the data frame and replace them one by one.

# Sample data (stringsAsFactors = TRUE so the columns are factors in R >= 4.0)
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE
)

# Loop over the columns
for (i in seq_along(dd)) {
  xx <- addNA(dd[, i])
  levels(xx) <- c(levels(dd[, i]), "none")
  dd[, i] <- xx
}

This gives us:

> str(dd)
'data.frame':   20 obs. of  2 variables:
 $ x: Factor w/ 4 levels "a","b","c","none": 1 4 1 4 4 1 4 3 3 3 ...
 $ y: Factor w/ 4 levels "A","B","C","none": 1 1 2 2 1 3 3 3 4 1 ...
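
As for the loop in the question: `hp[i]` returns a one-column data frame (a list), not the column itself, so `as.character()` and `as.factor()` are handed a list rather than an atomic vector, which is what the `sort.list` error points at. Extracting the column with `hp[[i]]` avoids this. A minimal sketch of the corrected loop, assuming a small stand-in for `hp` with only two of the question's columns (the sample values are made up for illustration):

```r
# Hypothetical stand-in for the question's data frame
hp <- data.frame(
  Alley = factor(c("Grvl", NA, "Pave", NA)),
  Fence = factor(c(NA, "MnPrv", NA, "GdWo"))
)
miss_names <- c("Alley", "Fence")

for (i in miss_names) {
  col <- as.character(hp[[i]])  # [[ ]] extracts the column as a vector
  col[is.na(col)] <- "NONE"     # replace the missing values
  hp[[i]] <- as.factor(col)     # convert back to factor
}

str(hp)
```

This keeps the character round trip from the question; the addNA() version above does the same job without it.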

Answer 2

An alternative solution using the purrr library, with the same data as @Johan Larsson:

library(purrr)

set.seed(15)
dd <- data.frame(
  x = sample(c(NA, letters[1:3]), 20, replace = TRUE),
  y = sample(c(NA, LETTERS[1:3]), 20, replace = TRUE),
  stringsAsFactors = TRUE
)

# Create a function to convert NA to "none"
convert.to.none <- function(x) {
  y <- addNA(x)
  levels(y) <- c(levels(x), "none")
  return(y)
}

# Use map_df() to cycle through dd's columns
map_df(dd, convert.to.none)

Wrapping the conversion in a function lets the same code scale to any number of columns.
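
If only some columns should be converted, as with the question's `miss_names`, the same kind of helper can be applied to a subset in base R. This is a sketch under assumptions: the data frame, its values, and the names `na_to_none` and `LotArea` are hypothetical and used only for illustration.

```r
# Helper in the spirit of the one in this answer; coerces to factor first
# so it also works on character columns
na_to_none <- function(x) {
  x <- as.factor(x)
  y <- addNA(x)                      # adds NA as the last level
  levels(y) <- c(levels(x), "none")  # rename that NA level to "none"
  y
}

# Hypothetical stand-in for the question's data frame
hp <- data.frame(
  Alley   = factor(c("Grvl", NA, "Pave")),
  Fence   = factor(c(NA, "MnPrv", NA)),
  LotArea = c(8450, 9600, 11250)
)
miss_names <- c("Alley", "Fence")

# Convert only the named columns; LotArea is left untouched
hp[miss_names] <- lapply(hp[miss_names], na_to_none)

str(hp)
```

Assigning back into `hp[miss_names]` keeps the result a data frame and leaves the other columns as they were.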
