Ignoring NA values in a function

Question

I'm writing my own function to calculate the mean of a column in a data set and then applying it with apply(), but it only returns the first column's mean. Below is my code:

mymean <- function(cleaned_us){
  column_total = sum(cleaned_us)
  column_length = length(cleaned_us)
  return (column_total/column_length)
}

Average_2 <- apply(numeric_clean_usnews,2,mymean,na.rm=T)

Answer 1

We need to use na.rm = TRUE inside sum(). Passing it through apply() is not going to work, because mymean doesn't have that argument. The length also has to count only the non-NA values:

mymean <- function(cleaned_us){
  column_total = sum(cleaned_us, na.rm = TRUE)  # change: ignore NAs in the sum
  column_length = sum(!is.na(cleaned_us))       # change: count only non-NA values
  return(column_total/column_length)
}
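
As a small check of the point above, here is a sketch on a made-up matrix `m` (not the asker's `numeric_clean_usnews`; `mymean_orig` is a name introduced here for the question's version). apply() forwards extra arguments to the function it calls, so the one-argument definition should be rejected with an "unused argument" error, while the revised definition handles NA on its own.

```r
# Hypothetical 3x2 matrix with one NA per column, for illustration only.
m <- matrix(c(1, 2, NA, 4, NA, 8), nrow = 3)

# Definition from the question: no na.rm parameter.
mymean_orig <- function(cleaned_us) {
  sum(cleaned_us) / length(cleaned_us)
}

# apply() passes na.rm = TRUE on to mymean_orig, which cannot accept it.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  apply(m, 2, mymean_orig, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)

# Revised definition: NA handling happens inside the function.
mymean <- function(cleaned_us) {
  column_total <- sum(cleaned_us, na.rm = TRUE)
  column_length <- sum(!is.na(cleaned_us))
  column_total / column_length
}

fixed <- apply(m, 2, mymean)
print(fixed)  # 1.5 6.0
```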

Note that colMeans can be used to get the mean of each column.
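
A minimal sketch of the colMeans route, assuming a small made-up data frame `df` in place of the asker's data. colMeans() takes its own na.rm argument, so no custom function is needed.

```r
# Hypothetical data frame with NA values, for illustration only.
df <- data.frame(a = c(1, 2, NA), b = c(4, NA, 8))

# Column means with NAs dropped per column.
col_means <- colMeans(df, na.rm = TRUE)
print(col_means)  # a = 1.5, b = 6.0
```

This only works when every column is numeric, which the name `numeric_clean_usnews` suggests but the question does not confirm.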

Answer 2

In order to pass an na.rm parameter to the function you defined, you need to make it a parameter of the function. The sum() function has an na.rm parameter, but length() doesn't. So to write the function you are trying to write, you could say:

# include `na.rm` as a parameter of the function
mymean <- function(cleaned_us, na.rm){

  # pass it to `sum()` 
  column_total = sum(cleaned_us, na.rm=na.rm)

  # if `na.rm` is set to `TRUE`, then don't count `NA`s 
  if (na.rm==TRUE){
    column_length = length(cleaned_us[!is.na(cleaned_us)])

  # but if it's `FALSE`, just use the full length
  } else {
    column_length = length(cleaned_us)
  }

  return (column_total/column_length)
}

Then your call should work:

Average_2 <- apply(numeric_clean_usnews, 2, mymean, na.rm=TRUE)
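
One possible refinement, not part of the answer above: as written, `na.rm` has no default, so calling `mymean(x)` without it fails. The sketch below gives it a default of FALSE, which mirrors base R's mean(). The name `mymean_default` and the vector `x` are made up for illustration.

```r
# Variant with a default, so na.rm is optional for the caller.
mymean_default <- function(cleaned_us, na.rm = FALSE) {
  if (na.rm) {
    # Drop NAs up front so both the sum and the length ignore them.
    cleaned_us <- cleaned_us[!is.na(cleaned_us)]
  }
  sum(cleaned_us) / length(cleaned_us)
}

x <- c(1, 2, NA, 5)
without_rm <- mymean_default(x)                # NA, because the sum is NA
with_rm <- mymean_default(x, na.rm = TRUE)     # (1 + 2 + 5) / 3
print(without_rm)
print(with_rm)
```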

Answer 3

Use na.omit()

set.seed(1)
m <- matrix(sample(c(1:9, NA), 100, replace=TRUE), 10)

mymean <- function(cleaned_us, na.rm){
    if (na.rm) cleaned_us <- na.omit(cleaned_us)
    column_total = sum(cleaned_us)
    column_length = length(cleaned_us)
    column_total/column_length
}

apply(m, 2, mymean, na.rm=TRUE)

# [1] 5.000 5.444 4.111 5.700 6.500 4.600 5.000 6.222 4.700 6.200

3 answers

Answer 1: score 3, accepted, 2017-11-11 19:16:49
Answer 2: score 0, 2017-11-11 19:18:53
Answer 3: score 0, 2017-11-11 19:22:31
