Removing repeats and blanks from R data frame

I apologise in advance for the data structure here, but I'm stuck with it...

I have a data frame with a lot of repeats and blanks, like this:

df <- data.frame(
country=c("Afghanistan", "Afghanistan", "Algeria", "Australia", "Australia", "Australia"), 
survey.1=c("Influenza","", "","","Influenza","Influenza"), 
survey.2=c("","Hepatitis C","","","",""), 
survey.3=c("West Nile Virus", "", "", "", "", "West Nile Virus"))

      country  survey.1    survey.2        survey.3
1 Afghanistan Influenza             West Nile Virus
2 Afghanistan           Hepatitis C                
3     Algeria                                      
4   Australia                                      
5   Australia Influenza                            
6   Australia Influenza             West Nile Virus

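I need to remove the repeats and blanks but keep the same data structure (I'm not sure what you would call this... 'concentrating' rather than 'aggregating', perhaps?), so that I end up with this: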

      country  survey.1    survey.2        survey.3
1 Afghanistan Influenza Hepatitis C West Nile Virus
2   Australia Influenza             West Nile Virus

Can anyone help?

Using plyr:

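library(plyr)

# For each country, collapse every column to its first non-blank value,
# or to "" when the column is blank for every row of that country.
ddply(df, .(country), function(x) {
  sapply(x, function(y) {
    y  <- as.character(y)            # guard against factor columns
    xx <- unique(y[nchar(y) > 0])    # distinct non-blank values
    if (length(xx) > 0) xx[1] else ""
  })
})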

     country  survey.1    survey.2        survey.3
1 Afghanistan Influenza Hepatitis C West Nile Virus
2     Algeria                                      
3   Australia Influenza             West Nile Virus
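
Note that the result above still contains Algeria, whose survey columns are all blank, whereas the desired output in the question omits it. The following is a minimal base R sketch of one way to get there, not the answerer's method: it swaps `ddply` for `aggregate` and then filters out the all-blank rows. The helper name `first_nonblank` and the variables `collapsed`, `survey_cols`, `keep` and `result` are made up for illustration, and `stringsAsFactors = FALSE` is an assumption added so the columns are character vectors on any R version.

```r
df <- data.frame(
  country  = c("Afghanistan", "Afghanistan", "Algeria",
               "Australia", "Australia", "Australia"),
  survey.1 = c("Influenza", "", "", "", "Influenza", "Influenza"),
  survey.2 = c("", "Hepatitis C", "", "", "", ""),
  survey.3 = c("West Nile Virus", "", "", "", "", "West Nile Virus"),
  stringsAsFactors = FALSE  # assumption: keep columns as character
)

# Hypothetical helper: first non-blank value of a column, or "" if none.
first_nonblank <- function(y) {
  y  <- as.character(y)
  xx <- unique(y[nzchar(y)])
  if (length(xx) > 0) xx[1] else ""
}

survey_cols <- setdiff(names(df), "country")

# Collapse each country to a single row.
collapsed <- aggregate(df[survey_cols], by = df["country"],
                       FUN = first_nonblank)

# Keep only countries with at least one non-blank survey value.
keep   <- rowSums(collapsed[survey_cols] != "") > 0
result <- collapsed[keep, , drop = FALSE]
rownames(result) <- NULL
result
```

Like the plyr version, this keeps only the first distinct non-blank value per country and column, so it would silently discard additional values if a country had more than one in the same survey column.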

