Removing duplicates and blanks from an R data frame

Question

I apologise in advance for the data structure here, but I'm stuck with it...

I have a data frame with lots of repeats and blanks, like so:

df <- data.frame(
country=c("Afghanistan", "Afghanistan", "Algeria", "Australia", "Australia", "Australia"), 
survey.1=c("Influenza","", "","","Influenza","Influenza"), 
survey.2=c("","Hepatitis C","","","",""), 
survey.3=c("West Nile Virus", "", "", "", "", "West Nile Virus"))

      country  survey.1    survey.2        survey.3
1 Afghanistan Influenza             West Nile Virus
2 Afghanistan           Hepatitis C                
3     Algeria                                      
4   Australia                                      
5   Australia Influenza                            
6   Australia Influenza             West Nile Virus

I need to remove the repeats and blanks but keep the same data structure (I don't know what you would call this... 'concentrating' as opposed to 'aggregating', maybe?). So what I'd end up with is this:

      country  survey.1    survey.2        survey.3
1 Afghanistan Influenza Hepatitis C West Nile Virus
2   Australia Influenza             West Nile Virus

Can anyone help?

Answer 1

Using plyr:

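library(plyr)

# For each country, collapse every column to its first non-blank value;
# a column with no non-blank values stays blank.
ddply(df, .(country), function(x) {
  sapply(x, function(y) {
    y  <- as.character(y)          # guard against factor columns
    xx <- unique(y[nchar(y) > 0])  # distinct non-blank values
    if (length(xx) > 0) xx[1] else ""
  })
})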

      country  survey.1    survey.2        survey.3
1 Afghanistan Influenza Hepatitis C West Nile Virus
2     Algeria                                      
3   Australia Influenza             West Nile Virus
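
Note that the output above still contains the all-blank Algeria row, which the desired result in the question leaves out. A minimal base R sketch of one way to drop it is shown here. It is not part of the original answer: `first_nonblank`, `survey_cols`, `collapsed` and `result` are names made up for illustration, and it assumes the columns hold character data and that keeping the first non-blank value per country is acceptable.

```r
# Example data from the question; stringsAsFactors = FALSE keeps columns as character.
df <- data.frame(
  country  = c("Afghanistan", "Afghanistan", "Algeria", "Australia", "Australia", "Australia"),
  survey.1 = c("Influenza", "", "", "", "Influenza", "Influenza"),
  survey.2 = c("", "Hepatitis C", "", "", "", ""),
  survey.3 = c("West Nile Virus", "", "", "", "", "West Nile Virus"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-blank value of a vector, or "" if there is none.
first_nonblank <- function(v) {
  v <- v[nzchar(v)]
  if (length(v) > 0) v[1] else ""
}

survey_cols <- setdiff(names(df), "country")

# Collapse each survey column to a single value per country.
collapsed <- aggregate(df[survey_cols],
                       by  = list(country = df$country),
                       FUN = first_nonblank)

# Keep only countries with at least one non-blank survey value.
keep   <- rowSums(collapsed[survey_cols] != "") > 0
result <- collapsed[keep, , drop = FALSE]
rownames(result) <- NULL

print(result)
```

If a country can have several different non-blank values in the same column, this sketch silently keeps only the first one, so that case would need its own handling.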

Answer 1: 2, accepted, 2014-02-06 10:26:11
