
R: Calculate the column medians by grouping the IDs

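Continuing from my previous post, I now want to group by ID (for the third column only), calculate the median of that column (Point_B) within each group, and then subtract that group median from each value in Point_B. NA values should still be returned as NA.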

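Note: I want the grouping by ID to apply only to the Point_B column, not to Point_A, because for Point_A I want to calculate the median of the whole column and subtract it from each value in Point_A.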

For example:

ID <- c("A","A","A","B","B","B","C","C","C") 
Point_A <- c(1,2,NA,1,2,3,1,2,NA) 
Point_B <- c(1,2,3,NA,NA,1,1,1,3)

df <- data.frame(ID,Point_A ,Point_B)


+----+---------+---------+
| ID | Point_A | Point_B |
+----+---------+---------+
| A  | 1       | 1       |
| A  | 2       | 2       |
| A  | NA      | 3       |
| B  | 1       | NA      |
| B  | 2       | NA      |
| B  | 3       | 1       |
| C  | 1       | 1       |
| C  | 2       | 1       |
| C  | NA      | 3       |
+----+---------+---------+

The solution provided in my previous post calculates the medians without grouping by ID. Here it is:

library(dplyr)
df %>%
  mutate_each(funs(median = . - median(., na.rm = TRUE)), -ID)

Desired output:

+----+---------+---------+
| ID | Point_A | Point_B |
+----+---------+---------+
| A  | -1      | -1      |
| A  | 0       | 0       |
| A  | NA      | 1       |
| B  | -1      | NA      |
| B  | 0       | NA      |
| B  | 1       | 0       |
| C  | -1      | 0       |
| C  | 0       | 0       |
| C  | NA      | 2       |
+----+---------+---------+
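
As a quick sanity check on the desired output, the medians being subtracted can be computed directly in base R. This is only an illustrative sketch; the names `overall_med` and `group_med` are introduced here for the check and are not part of the original post.

```r
# Same example data as above
ID      <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
Point_A <- c(1, 2, NA, 1, 2, 3, 1, 2, NA)
Point_B <- c(1, 2, 3, NA, NA, 1, 1, 1, 3)

# Median of the whole Point_A column, ignoring NA
overall_med <- median(Point_A, na.rm = TRUE)

# One median of Point_B per ID, ignoring NA within each group
group_med <- tapply(Point_B, ID, median, na.rm = TRUE)

overall_med  # 2
group_med    # A = 2, B = 1, C = 1
```

Subtracting 2 from every Point_A value, and the matching group median from every Point_B value, gives the table above.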

How can I get the values in column 3 (Point_B) with the median calculated per ID group?

I guess you want a group_by (following @docendodiscimus's suggestion):

demed <- function(x) x-median(x,na.rm=TRUE)

df %>% 
  mutate_each(funs(demed),Point_A) %>%
  group_by(ID) %>%  
  mutate_each(funs(demed),Point_B)

which gives

  ID Point_A Point_B
1  A      -1      -1
2  A       0       0
3  A      NA       1
4  B      -1      NA
5  B       0      NA
6  B       1       0
7  C      -1       0
8  C       0       0
9  C      NA       2
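
In more recent dplyr releases `mutate_each()` and `funs()` are deprecated, so the same idea may be easier to run with plain `mutate()`. The following is a minimal sketch under that assumption, not the code from the original answer; `res` is a name introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  ID      = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Point_A = c(1, 2, NA, 1, 2, 3, 1, 2, NA),
  Point_B = c(1, 2, 3, NA, NA, 1, 1, 1, 3)
)

# Subtract the median (ignoring NA) from every element of x
demed <- function(x) x - median(x, na.rm = TRUE)

res <- df %>%
  mutate(Point_A = demed(Point_A)) %>%  # median over the whole column
  group_by(ID) %>%
  mutate(Point_B = demed(Point_B)) %>%  # median within each ID
  ungroup()                             # drop grouping so later steps are not per-ID
```

The order matters: Point_A is centred before `group_by()`, so only Point_B sees the grouping.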

I prefer the analogous data.table code. Its syntax requires writing the variable names several times, but it needs far fewer parentheses:

require(data.table)
DT <- data.table(df)

DT[,Point_A:=demed(Point_A)
][,Point_B:=demed(Point_B)
,by=ID]
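
If neither package is available, the same result can probably be reached in base R with `ave()`, which applies a function within groups and returns a vector of the original length. This is a sketch offered as an alternative, not part of the original answer; `out` is a name introduced here for illustration.

```r
df <- data.frame(
  ID      = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Point_A = c(1, 2, NA, 1, 2, 3, 1, 2, NA),
  Point_B = c(1, 2, 3, NA, NA, 1, 1, 1, 3)
)

# Subtract the median (ignoring NA) from every element of x
demed <- function(x) x - median(x, na.rm = TRUE)

out <- df
out$Point_A <- demed(df$Point_A)                     # whole-column median
out$Point_B <- ave(df$Point_B, df$ID, FUN = demed)   # per-ID median
out
```

Note that `FUN` has to be passed by name, because `ave()` treats unnamed extra arguments as grouping variables.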

