Removing NA values from a data frame using R

Question

I have a large data frame (501 rows by 42844 columns) that contains ?_? values. Using R, I have already replaced them with NA using the code below:

data[data == "?_?"] <- NA

So I have NA values now, and I want to omit them from the data frame, but something goes wrong. When I run the command below:

data_na_rm <- na.omit(data)

I get an object with 0 rows and 42844 columns as a result.

dim(data_na_rm) #gives me 0 42844
data_na_rm[1,2] #gives me NA
data_na_rm[5,3] #gives me NA
############################
data_na_rm[2]   #gives me the title of the second column 
data_na_rm[5]   #gives me the title of the fifth

What should I do? I've spent too many hours on this. I would appreciate it if anyone could take some time to help me.

Answer 1

As JackStat said in the comments, you might have NAs in every row, in which case na.omit() drops all of them. Maybe you should test for that:

# Some Data. All rows have an NA but not all columns

df <- data.frame(col1 = c(NA, 2, 3, 4),
                 col2 = c(1, NA, 3, 4),
                 col3 = c(1, 2, NA, 4),
                 col4 = c(1, 2, 3, NA),
                 col5 = c(1, 2, 3, 4))

# test whether an NA is present in each row

apply(df, 1, function(x) {sum(is.na(x)) > 0})
[1] TRUE TRUE TRUE TRUE
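
On a frame as wide as yours, a vectorised check may be quicker than apply(). Here is a minimal sketch of the same test, using only base R (rowSums(), complete.cases(), na.omit()); the example data is repeated so the snippet runs on its own, and the names row_has_na and n_complete are just illustrative:

```r
# Same example data as above: every row has exactly one NA
df <- data.frame(col1 = c(NA, 2, 3, 4),
                 col2 = c(1, NA, 3, 4),
                 col3 = c(1, 2, NA, 4),
                 col4 = c(1, 2, 3, NA),
                 col5 = c(1, 2, 3, 4))

# TRUE for each row that contains at least one NA
row_has_na <- rowSums(is.na(df)) > 0

# complete.cases() is the complement: TRUE for rows with no NA at all
n_complete <- sum(complete.cases(df))

# na.omit() keeps only the complete rows, so if n_complete is 0
# the result has 0 rows, which matches what you are seeing
n_after_omit <- nrow(na.omit(df))
```

If all(row_has_na) is TRUE for your real data, na.omit() is working as intended and the 0-row result is expected.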

This will help you find which columns are contributing the most NAs. It counts the NAs in each column:

apply(df, 2, function(x) {sum(is.na(x))})
col1 col2 col3 col4 col5 
   1    1    1    1    0
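
Once you know the per-column counts, one possible next step is to drop the worst columns before calling na.omit(), so that some rows survive. This is only a sketch of that idea, not something from the question: the cutoff max_na_frac is a hypothetical threshold you would have to choose for your own data, and the example data is repeated so the snippet runs on its own.

```r
# Same example data as above
df <- data.frame(col1 = c(NA, 2, 3, 4),
                 col2 = c(1, NA, 3, 4),
                 col3 = c(1, 2, NA, 4),
                 col4 = c(1, 2, 3, NA),
                 col5 = c(1, 2, 3, 4))

# Fraction of NAs in each column
na_frac <- colMeans(is.na(df))

# Hypothetical cutoff (an assumption): keep only columns with no NAs at all
max_na_frac <- 0
df_cols <- df[, na_frac <= max_na_frac, drop = FALSE]

# With the NA-bearing columns removed, na.omit() no longer discards every row
df_clean <- na.omit(df_cols)
dim(df_clean)
```

Raising max_na_frac keeps more columns but lets na.omit() remove more rows, so the right value depends on whether you care more about keeping rows or columns.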

Answer 1 (accepted), 2016-11-18 02:32:21
