Calculating column medians with NA values

Question

I am trying to calculate the median of individual columns in R and then subtract that median from every value in the column. The problem is that my columns contain NAs, which I don't want to remove; they should simply be returned as NA, without the median subtracted. For example:

ID <- c("A","B","C","D","E") 
Point_A <- c(1, NA, 3, NA, 5) 
Point_B <- c(NA, NA, 1, 3, 2)

df <- data.frame(ID,Point_A ,Point_B)

Is it possible to calculate the median of a column that contains NAs? My desired output would be:

+----+---------+---------+
| ID | Point_A | Point_B |
+----+---------+---------+
| A  | -2      | NA      |
| B  | NA      | NA      |
| C  | 0       | -1      |
| D  | NA      | 1       |
| E  | 2       | 0       |
+----+---------+---------+

Answer 1

If we are talking about real NA values (as per the OP's comment), one could do

df[-1] <- lapply(df[-1], function(x) x - median(x, na.rm = TRUE))
df
#   ID Point_A Point_B
# 1  A      -2      NA
# 2  B      NA      NA
# 3  C       0      -1
# 4  D      NA       1
# 5  E       2       0
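
To see why na.rm = TRUE is needed, and why the NA entries survive the subtraction, here is a minimal sketch on a single vector (x is just the Point_A column from the question, copied for illustration):

```r
# Example column from the question
x <- c(1, NA, 3, NA, 5)

# By default, a single NA makes the median NA
median(x)

# na.rm = TRUE drops the NAs only while the median is computed
med <- median(x, na.rm = TRUE)

# Subtraction propagates NA, so the missing entries stay where they are
centered <- x - med
centered
```

The NAs are never removed from the column itself; they are only ignored inside median().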

Or using the matrixStats package

library(matrixStats)
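# sweep() matches each median to its column; `df[-1] - medians` would recycle them down the rows
df[-1] <- sweep(df[-1], 2, colMedians(as.matrix(df[-1]), na.rm = TRUE))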

where the original df is

df <- structure(list(ID = structure(1:5, .Label = c("A", "B", "C", 
"D", "E"), class = "factor"), Point_A = c(1, NA, 3, NA, 5), Point_B = c(NA, 
NA, 1, 3, 2)), .Names = c("ID", "Point_A", "Point_B"), row.names = c(NA, 
-5L), class = "data.frame")
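
One thing to watch with this kind of column-wise subtraction: subtracting a plain numeric vector from a data frame recycles the vector down the rows, not across the columns. The base-R sketch below compares that with sweep(), which lines each median up with its own column. It uses apply() instead of matrixStats purely to avoid the dependency, and the names meds, recycled and centered are illustrative.

```r
df <- data.frame(
  ID = c("A", "B", "C", "D", "E"),
  Point_A = c(1, NA, 3, NA, 5),
  Point_B = c(NA, NA, 1, 3, 2)
)

# Column medians in base R (NAs ignored)
meds <- apply(as.matrix(df[-1]), 2, median, na.rm = TRUE)

# Plain subtraction recycles meds element by element down each column
recycled <- df[-1] - meds

# sweep() with MARGIN = 2 subtracts meds[j] from column j
centered <- sweep(df[-1], 2, meds)

recycled$Point_B  # row D does not match the expected output
centered$Point_B  # matches the expected output
```

On this particular data the recycled version happens to give the right Point_A column, which makes the problem easy to miss.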

Answer 2

Another option is

library(dplyr)
 df %>%
     mutate_each(funs(median=.-median(., na.rm=TRUE)), -ID)
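
mutate_each() and funs() have since been deprecated in dplyr. Assuming dplyr 1.0.0 or later, the same idea could be sketched with across() (result is just an illustrative name):

```r
library(dplyr)

df <- data.frame(
  ID = c("A", "B", "C", "D", "E"),
  Point_A = c(1, NA, 3, NA, 5),
  Point_B = c(NA, NA, 1, 3, 2)
)

# across(-ID, ...) applies the function to every column except ID
result <- df %>%
  mutate(across(-ID, ~ .x - median(.x, na.rm = TRUE)))
result
```

The columns are overwritten in place, so ID is left untouched and the NA entries stay NA.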

Answer 3

Of course it is possible.

median(df[,]$Point_A, na.rm = TRUE)

where df is the data frame and df[,] selects all rows and columns; the column is then picked out by $Point_A. The same can be written in this notation:

median(df[,"Point_A"], na.rm = TRUE)

where, once again, df[,"Point_A"] selects all rows of the column Point_A.
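
The calls above only compute the median. A minimal sketch of how the subtraction asked for in the question could follow for one column (med_a is an illustrative name):

```r
df <- data.frame(
  ID = c("A", "B", "C", "D", "E"),
  Point_A = c(1, NA, 3, NA, 5),
  Point_B = c(NA, NA, 1, 3, 2)
)

# Both notations select the same column, so they give the same median
med_a <- median(df[, "Point_A"], na.rm = TRUE)
stopifnot(identical(med_a, median(df$Point_A, na.rm = TRUE)))

# Subtract the median; NA entries remain NA
df[, "Point_A"] <- df[, "Point_A"] - med_a
df
```

The same two lines would need to be repeated (or wrapped in lapply()) for each further column such as Point_B.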

Answer 1
6 2015-04-29 20:48:06

Answer 2
4 accepted 2015-04-29 20:57:22

Answer 3
0 2015-04-29 20:48:13
