R collapsing multiple rows into one row by grouping multiple columns
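I want to collapse multiple rows into one row by grouping on several columns while leaving the others ungrouped. The columns that are not used for grouping contain NAs. After trying multiple solutions, the resulting table is full of NAs and has no values. I could only get a solution to work by setting the NAs to 0 first. I don't want to introduce 0 into the data frame, because some of the measurements really are zero.
This is a follow-up to "R collapsing multiple rows into 1 row". I tried all of the solutions recommended there, and the data came out as NA.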
TreatName <- c('Static', 'Static', 'Dynamic', 'Static')
id <- c('patient1', 'patient1', 'patient2', 'patient2')
Method <- c('IV', 'IV', 'IV', 'IV')
# NA marks "no measurement taken"; 0 is a real measurement
drug1 <- c(34, NA, NA, NA)
drug2 <- c(NA, 7, NA, NA)
drug3 <- c(NA, NA, 56, 0)
df <- data.frame(TreatName, id, Method, drug1, drug2, drug3)
library(plyr)
groupColumns = c("TreatName","id", "Method")
dataColumns = c( "drug1", "drug2","drug3")
df1<-ddply(df, groupColumns, function(x) colSums(x[dataColumns]))
The expected result is:
TreatName  id        Method  drug1  drug2  drug3
Static     patient1  IV      34     7      NA
Dynamic    patient2  IV      NA     NA     56
Static     patient2  IV      NA     NA     0
The actual result is:
TreatName  id        Method  drug1  drug2  drug3
Dynamic    patient2  IV      NA     NA     56
Static     patient1  IV      NA     NA     NA
Static     patient2  IV      NA     NA     0
I noticed that if I replace the NAs with zero
df[is.na(df)]<-0
and then run the ddply call, it works. But that introduces zeros where no measurement was taken.
Open to any solutions
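For context, the all-NA row for patient1 follows from how sums treat missing values in R. The lines below are a small sketch using plain sum(); colSums() behaves the same way column by column. The variable names are illustrative only.

```r
x <- c(34, NA)                          # one measurement, one missing value
all_missing <- c(NA_real_, NA_real_)    # no measurement at all

propagated  <- sum(x)                         # NA: a missing value propagates
dropped     <- sum(x, na.rm = TRUE)           # 34: NAs are removed before summing
empty_total <- sum(all_missing, na.rm = TRUE) # 0: nothing left to sum

print(c(propagated, dropped, empty_total))
```

So na.rm = TRUE alone is not enough here: it turns a group with no measurements into 0, which is the same problem as replacing the NAs with 0 up front.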
Here is one option with dplyr:
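library(dplyr)
# For each data column: NA if the whole group is missing,
# otherwise the non-missing value(s) in that group
df %>%
  group_by_at(groupColumns) %>%
  summarise_at(vars(dataColumns), ~ if (all(is.na(.))) NA_real_ else na.omit(.))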
# A tibble: 3 x 6
# Groups: TreatName, id [3]
# TreatName id Method drug1 drug2 drug3
# <fct> <fct> <fct> <dbl> <dbl> <dbl>
#1 Dynamic patient2 IV NA NA 56
#2 Static patient1 IV 34 7 NA
#3 Static patient2 IV NA NA 0
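The na.omit() approach above relies on each group holding at most one non-missing value per column. If a group could hold several, one possible variant is to sum whatever is present and return NA only when nothing was measured. The following is a minimal sketch in base R (so it needs no packages); sum_or_na and collapsed are hypothetical names chosen for illustration, not part of any library.

```r
# Rebuild the example data from the question.
df <- data.frame(
  TreatName = c("Static", "Static", "Dynamic", "Static"),
  id        = c("patient1", "patient1", "patient2", "patient2"),
  Method    = c("IV", "IV", "IV", "IV"),
  drug1     = c(34, NA, NA, NA),
  drug2     = c(NA, 7, NA, NA),
  drug3     = c(NA, NA, 56, 0)
)
groupColumns <- c("TreatName", "id", "Method")
dataColumns  <- c("drug1", "drug2", "drug3")

# Hypothetical helper: NA when the group has no measurement at all,
# otherwise the sum of the values that are present.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# The data-frame method with `by` keeps rows whose measurement columns
# contain NA (the formula interface would drop them by default).
collapsed <- aggregate(df[dataColumns], by = df[groupColumns], FUN = sum_or_na)
print(collapsed)
```

With this helper a true measurement of 0 stays 0, while a group with no measurements stays NA.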