
Select specific names from list of dataframes in R

Sample data:

df <- data.frame(names=letters[1:10],name1=rnorm(10,1,1),name2=rexp(10,2))

list <- list(df,df)

vec_name <- c("f","i","c") # desired row names 

For each data frame in the list, I would like to select the rows whose names are given in vec_name:

Desired outcome:

[[1]]
      names      value1    value2
   6   nd:f   -1.6323952 0.3117470
   9   nd:i    1.8270855 0.2475741
   3   nd:c    0.6978422 0.4695581   # the ordering does matter; must be as seen in vec_name

[[2]]
      names      value1    value2
   6   ad:f   -1.6323952 0.3117470
   9   ad:i    1.8270855 0.2475741
   3   ad:c    0.6978422 0.4695581

Desired output 2: a single data frame, which I believe would just be do.call(rbind, list):

However, the clean names from vec_name should be used instead.

      names      value1    value2
   1      f   -1.6323952 0.3117470
   2      i    1.8270855 0.2475741
   3      c    0.6978422 0.4695581 
   4      f   -1.6323952 0.3117470
   5      i    1.8270855 0.2475741
   6      c    0.6978422 0.4695581

I have tried sapply and lapply, for example:

lapply(list, function(x) x[grepl(vec_name,x$names),])

EDIT: Please see the edited question above.

You were almost there. The warning message said:

Warning messages:
1: In grepl(vec_name, x$names) :
   argument 'pattern' has length > 1 and only the first element will be used

The reason is that you passed a vector to grepl, which expects a single regex as its pattern (see ?regex). What you want instead is to match the contents:

lapply(list, function(x) x[match(vec_name,x$names),])
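
A small sketch of the difference, on a toy data frame (the names toy, by_grepl and by_match are made up for this illustration). Collapsing vec_name into one alternation pattern makes grepl work, but the rows come back in data order, whereas match returns one position per element of vec_name, in that order:

```r
# Toy data; `toy` is a hypothetical name used only for this illustration.
toy <- data.frame(names = letters[1:10], value = 1:10,
                  stringsAsFactors = FALSE)
vec_name <- c("f", "i", "c")

# grepl needs a single pattern; an alternation works, but the rows
# come back in the order they appear in the data, not in vec_name.
by_grepl <- toy[grepl(paste(vec_name, collapse = "|"), toy$names), ]
by_grepl$names   # "c" "f" "i"

# match() returns one position per element of vec_name, in that order.
by_match <- toy[match(vec_name, toy$names), ]
by_match$names   # "f" "i" "c"
```

Note that match returns NA for a name that is not found, which produces a row of NAs in the result, so it may be worth checking for missing names first.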

This gives you a list of data.frame objects. If you want to combine them afterwards, just use:

do.call(rbind, lapply(list, function(x) x[match(vec_name,x$names),]))
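
One detail to be aware of: do.call(rbind, ...) carries over the original row labels (made unique), so the combined data frame is not numbered 1 to 6 as in the desired output. A minimal sketch of resetting them, assuming a fixed seed for reproducibility and using the hypothetical names df_list and combined:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(names = letters[1:10], name1 = rnorm(10, 1, 1),
                 name2 = rexp(10, 2), stringsAsFactors = FALSE)
df_list <- list(df, df)   # hypothetical name, so the variable is not called `list`
vec_name <- c("f", "i", "c")

# Select the rows per data frame in vec_name order, then stack them.
picked <- lapply(df_list, function(x) x[match(vec_name, x$names), ])
combined <- do.call(rbind, picked)

rownames(combined) <- NULL   # drop the carried-over row labels
combined$names               # "f" "i" "c" "f" "i" "c"
```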

Or use ldply from the plyr package:

library(plyr)
ldply(list, function(x) x[match(vec_name,x$names),])
#   names       name1     name2
# 1     f  2.01421228 0.4489627
# 2     i  0.28899891 0.8323940
# 3     c -0.01746007 1.5309936
# 4     f  2.01421228 0.4489627
# 5     i  0.28899891 0.8323940
# 6     c -0.01746007 1.5309936

As a side remark: avoid using the names of built-in functions such as list for your variables, to prevent unwanted effects.

Update

Taking the comments into account (vec_name does not exactly match the names in the data.frame), you should first clean the names and then do the match. This assumes, however, that your 'uncleaned' names consist of a prefix and the clean name separated by a colon (':'); if that is not the case, adapt the regex in the gsub statement:

ldply(list, function(x) x[match(vec_name, gsub(".*:(.*)", "\\1", x$names)),])
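
To make the cleaning step concrete, here is a sketch on toy data in the "nd:f" style of the question (prefixed, clean and res are hypothetical names). It also overwrites the names column with the clean names, which is what the second desired output asks for; that last step assumes every entry of vec_name was found:

```r
# Toy data with prefixed names, mimicking the "nd:f" style in the question.
prefixed <- data.frame(names = paste0("nd:", letters[1:10]),
                       value1 = 1:10, stringsAsFactors = FALSE)
vec_name <- c("f", "i", "c")

# Keep only the part after the colon.
clean <- gsub(".*:(.*)", "\\1", prefixed$names)
clean[1:3]   # "a" "b" "c"

# Select rows in vec_name order, then replace the prefixed names.
res <- prefixed[match(vec_name, clean), ]
res$names <- vec_name     # assumes no NA came back from match()
rownames(res) <- NULL
res$names    # "f" "i" "c"
res$value1   # 6 9 3
```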

For the first output:

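output1 <- lapply(list, function(elt) {
  # one column per entry of vec_name; regexpr gives -1 where there is no match
  resmatch <- sapply(vec_name, function(x) regexpr(x, elt$names))
  # row index of the match for each entry, in vec_name order
  elt <- elt[apply(resmatch, 2, function(rg) which(rg > 0)), ]
  colnames(elt) <- c("names", "value1", "value2")
  return(elt)
})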

> output1
[[1]]
  names     value1    value2
6  nd:f -0.2132962 0.7618105
9  nd:i -0.6580247 0.6010379
3  nd:c  0.9302625 0.1490061

[[2]]
  names     value1    value2
6  nd:f -0.2132962 0.7618105
9  nd:i -0.6580247 0.6010379
3  nd:c  0.9302625 0.1490061
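
A caveat with regex-based selection, sketched on toy names (nm, loose and anchored are hypothetical names): an unanchored pattern matches anywhere in the string, including the prefix, so a short name such as "d" would hit every "nd:x" entry. Anchoring the pattern to the part after the colon is one possible way around this, assuming the colon-separated layout from the question:

```r
nm <- paste0("nd:", letters[1:10])   # toy names in the question's "nd:x" style

# An unanchored pattern matches anywhere, including inside the prefix.
loose <- which(regexpr("d", nm) > 0)
loose      # 1 2 3 4 5 6 7 8 9 10

# Anchoring to the part after the colon restricts the match to the clean name.
anchored <- which(regexpr(":d$", nm) > 0)
anchored   # 4
```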

For the second output, you can do what you intended:

output2<-do.call(rbind,output1)

> output2

   names     value1    value2
6   nd:f -0.2132962 0.7618105
9   nd:i -0.6580247 0.6010379
3   nd:c  0.9302625 0.1490061
61  nd:f -0.2132962 0.7618105
91  nd:i -0.6580247 0.6010379
31  nd:c  0.9302625 0.1490061
