
For all files in current directory, find file with same prefix in another directory [R]

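I have a very basic question, and I apologize if it has already been asked elsewhere (I did look for an answer).

I wrote a script that creates two directories and, for a given file, saves information under the same name in each of them. In one directory I save the data formatted for making a boxplot with ggplot; in the other I save the annotation information. When I then create a boxplot, I want to search the annotation directory for the corresponding annotation file so that I can add the annotations to the boxplot. The code is set up to do this for all files in a given directory, so I can't simply change the working directory and load the annotation file by name. Here is what I have: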

In the directory named ggplot/data, files are saved as: my_data_1.csv

In the directory named ggplot/annotation, files are saved as: my_data_1.csv

The final annotated graphs are saved in ggplot/graph_output.

# goto ggplot data directory
setwd("/home/path/to/ggplot/data")

#look for all files
inFilePaths = list.files(path=".", pattern=glob2rx("*"), full.names=TRUE)

#make a ggplot2 boxplot for every file
for (inFilePath in inFilePaths)
{ 
  # Read in each data file as a dataframe
  inFileData = read_csv(inFilePath)

  # Make a ggplot. **This is only part of my code to save space**
  plot1 = ggplot(data =inFileData, mapping= aes(x=Sample, y=Expression)) +
    scale_fill_manual(values=c("#606060", "#29a329"))

  # Change directories to annotation folder
  setwd("/home/path/to/ggplot/annotation")

  ####Help!!!!#### Write something to find the file with same inFilePath name to get annotations
  ##Maybe something like this:
  inFilePaths2 = list.files(path=".", pattern=glob2rx(inFileData), full.names=TRUE)
  ##This does not work because it can't find the same inFileData file used to make the ggplot

  # annotate ggplot with corresponding annotation file
  for (inFilePath in inFilePaths2)
  {
    pvalues = read_csv(...of the file that matches the file name of the ggplot data) 
    plt2_annot <- plot1 +
      geom_text(data=pvalues, aes(x=value, y=breaks,label = paste('P:',format.pval(pval, digits=1))))
  }


  # specify size of ggplot base on number of boxes displayed using total rows of data
  n = 0.25+(0.75*(nrow(unique(select(inFileData, Gene)))))

  # Change directories to graph output folder, and save graph
  setwd("/home/path/to/ggplot/graph_output")
  ggsave(filename = paste0(inFilePath, ".png"), plot=plt2_annot, height = 1.5, width = n, units = "in")
}

Using Gregor's comment, I managed to come up with a very simple solution.

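1) I changed the way I name the files in each directory so that a data file and its corresponding annotation file have exactly the same name.

2) Rather than trying to implement some function to find the annotation file that corresponds to the current inFilePath data file, I change the working directory to the annotation directory and call read_csv(inFilePath) again. Because the names match, this loads the corresponding annotation file.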
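
If renaming the files is not an option, a lookup by prefix along the following lines might work. This is only a sketch: the helper name find_by_prefix, the temporary directories, and the "_annot" suffix are made up for illustration and are not part of the code in this post.

```r
# Hypothetical helper: return the files in `annotation_dir` whose names start
# with the prefix of `data_path` (its file name without the extension).
find_by_prefix <- function(data_path, annotation_dir) {
  prefix <- tools::file_path_sans_ext(basename(data_path))
  candidates <- list.files(annotation_dir, full.names = TRUE)
  candidates[startsWith(basename(candidates), prefix)]
}

# Demo in a throwaway directory with made-up file names
base_dir <- tempfile("ggplot_demo_")
data_dir <- file.path(base_dir, "data")
annotation_dir <- file.path(base_dir, "annotation")
dir.create(data_dir, recursive = TRUE)
dir.create(annotation_dir, recursive = TRUE)
file.create(file.path(data_dir, "my_data_1.csv"))
file.create(file.path(annotation_dir, "my_data_1_annot.csv"))
file.create(file.path(annotation_dir, "other_2_annot.csv"))

matches <- find_by_prefix(file.path(data_dir, "my_data_1.csv"), annotation_dir)
print(basename(matches))
```

Note that a plain prefix test is loose: the prefix my_data_1 would also match a file such as my_data_10_annot.csv, so a stricter rule (for example, comparing the prefix plus a fixed separator) may be needed depending on the naming scheme.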

Here is the code that finally worked for me:

library(readr)  # provides read_csv

# goto ggplot data directory
setwd("/home/path/to/ggplot/data")

#look for all files
inFilePaths = list.files(path=".", pattern=glob2rx("*"), full.names=TRUE)

#make a ggplot2 boxplot for every data file
for (inFilePath in inFilePaths)
{ 
  #Need to set directory again due to directory change lower in the loop
  setwd("/home/path/to/ggplot/data")
  # Read in each data file as a dataframe
  inFileData = read_csv(inFilePath)

  #check to see which data is loaded
  print(inFileData)

  #make a ggplot from the ggplot data

  # Change directories to annotation folder
  setwd("/home/path/to/ggplot/annotation")

  #load new annotation data. The file names are the same, so loading the same file name in the annotations
  # directory actually loads the annotations for the corresponding plot
  inFileData2 = read_csv(inFilePath)

  #check to make sure the correct annotation file is loaded
  print(inFileData2)

  #add annotation to ggplot graph
  #now that I can access the correct annotation, I'll work on this part next.
  #then save the graph

}
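
The repeated setwd() calls can probably be avoided by building full paths with file.path() and basename(). Below is a minimal sketch, assuming both directories hold files with identical names. The temporary directories, the sample values, and the use of base read.csv() (chosen here instead of read_csv() so the example has no package dependency) are assumptions for the demo, not part of the code above.

```r
# Throwaway directories standing in for ggplot/data and ggplot/annotation
base_dir <- tempfile("ggplot_demo_")
data_dir <- file.path(base_dir, "data")
annotation_dir <- file.path(base_dir, "annotation")
dir.create(data_dir, recursive = TRUE)
dir.create(annotation_dir, recursive = TRUE)

# Made-up sample files that share the same name in both directories
write.csv(data.frame(Sample = c("a", "b"), Expression = c(1, 2)),
          file.path(data_dir, "my_data_1.csv"), row.names = FALSE)
write.csv(data.frame(pval = 0.03),
          file.path(annotation_dir, "my_data_1.csv"), row.names = FALSE)

loaded <- list()
data_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

for (data_path in data_paths) {
  # Same file name, different directory; no working-directory change needed
  annotation_path <- file.path(annotation_dir, basename(data_path))

  if (!file.exists(annotation_path)) {
    warning("No annotation file found for ", basename(data_path))
    next
  }

  loaded[[basename(data_path)]] <- list(
    data = read.csv(data_path),
    annotations = read.csv(annotation_path)
  )
  # The plot would be built and annotated here, then saved to the output
  # directory with a full path as well, for example
  # file.path(output_dir, paste0(basename(data_path), ".png")),
  # where output_dir is a hypothetical variable.
}

print(names(loaded))
```

Keeping the directories in variables means the loop no longer depends on where the working directory happens to be, which also removes the need to reset it at the top of each iteration.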

Thank you for your help.
