
For all files in the current directory, find the file with the same prefix in another directory (R)

I have a very basic question, and I apologize if it has been asked elsewhere (I tried to find the answer, I really did).

I have written a script that creates two directories and saves information for a given file in each directory under the same name. In one directory I have formatted data to make a boxplot using ggplot; in the other directory I have saved annotation information. I then make a boxplot, and I want to search the annotation directory for the corresponding annotation file so I can add the annotations to the boxplot. The code is set up to perform this operation for "all files" in a given directory, so I can't simply change working directories and load the annotation file by name. Here is what I have so far:

In the directory called ggplot/data, files are saved as: my_data_1.csv

In the directory called ggplot/annotation, files are saved as: my_data_1.csv

Final annotated graphs are saved in ggplot/graph_output.

# goto ggplot data directory
setwd("/home/path/to/ggplot/data")

#look for all files
inFilePaths = list.files(path=".", pattern=glob2rx("*"), full.names=TRUE)

#make a ggplot2 boxplot for every file
for (inFilePath in inFilePaths)
{ 
  # Read in each data file as a dataframe
  inFileData = read_csv(inFilePath)

  # Make a ggplot. **This is only part of my code to save space**
  plot1 = ggplot(data =inFileData, mapping= aes(x=Sample, y=Expression)) +
    scale_fill_manual(values=c("#606060", "#29a329"))

  # Change directories to annotation folder
  setwd("/home/path/to/ggplot/annotation")

  ####Help!!!!#### Write something to find the file with same inFilePath name to get annotations
  ##Maybe something like this:
  inFilePaths2 = list.files(path=".", pattern=glob2rx(inFileData), full.names=TRUE)
  ##This does not work because it can't find the same inFileData file used to make the ggplot

  # annotate ggplot with corresponding annotation file
  for (inFilePath in inFilePaths2)
  {
    pvalues = read_csv(...of the file that matches the file name of the ggplot data) 
    plt2_annot <- plot1 +
      geom_text(data=pvalues, aes(x=value, y=breaks,label = paste('P:',format.pval(pval, digits=1))))
  }


  # specify size of ggplot based on number of boxes displayed using total rows of data
  n = 0.25+(0.75*(nrow(unique(select(inFileData, Gene)))))

  # Change directories to graph output folder, and save graph
  setwd("/home/path/to/ggplot/graph_output")
  ggsave(filename = paste0(inFilePath, ".png"), plot=plt2_annot, height = 1.5, width = n, units = "in")
}

Using Gregor's comments, I managed to come up with a very easy solution.

1) I changed how I named my files in each directory so the data file and corresponding annotation file have the exact same name.

2) Rather than writing a function to find the annotation file that corresponds to the current inFilePath data file, I simply changed the working directory to the annotation directory and re-read inFilePath with read_csv(inFilePath), which loads the corresponding annotation file.

Here is the code that ended up working for me:

# goto ggplot data directory
setwd("/home/path/to/ggplot/data")

#look for all files
inFilePaths = list.files(path=".", pattern=glob2rx("*"), full.names=TRUE)

#make a ggplot2 boxplot for every data file
for (inFilePath in inFilePaths)
{ 
  #Need to set directory again due to directory change lower in the loop
  setwd("/home/path/to/ggplot/data")
  # Read in each data file as a dataframe
  inFileData = read_csv(inFilePath)

  #check to see which data is loaded
  print(inFileData)

  #make a ggplot from the ggplot data

  # Change directories to annotation folder
  setwd("/home/path/to/ggplot/annotation")

  #load new annotation data. The file names are the same, so loading the same file name in the annotations
  # directory actually loads the annotations for the corresponding plot
  inFileData2 = read_csv(inFilePath)

  #check to make sure the correct annotation file is loaded
  print(inFileData2)

  #add annotation to ggplot graph
  #now that I can access the correct annotation, I'll work on this part next.
  #then save the graph

  }
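
The same name-matching idea can also be sketched without calling setwd() at all, by building both paths from the bare file name. This is only a minimal sketch, not the code from the post: the helper name match_annotation_files, the relative directory paths, and the use of base R read.csv (instead of read_csv) are all assumptions made for illustration.

```r
# Hypothetical helper (not from the original post): pair each data file with
# the annotation file of the same name, without changing the working directory.
match_annotation_files <- function(data_dir, annotation_dir, output_dir) {
  # full.names = FALSE returns bare file names such as "my_data_1.csv"
  file_names <- list.files(data_dir, pattern = "\\.csv$", full.names = FALSE)

  pairs <- data.frame(
    data_path       = file.path(data_dir, file_names),
    annotation_path = file.path(annotation_dir, file_names),
    # Output name: replace the .csv extension with .png
    output_path     = file.path(output_dir, sub("\\.csv$", ".png", file_names)),
    stringsAsFactors = FALSE
  )

  # Flag data files that have no annotation file of the same name
  pairs$has_annotation <- file.exists(pairs$annotation_path)
  pairs
}

# Assumed directory layout, relative to wherever the script is run
pairs <- match_annotation_files("ggplot/data", "ggplot/annotation",
                                "ggplot/graph_output")

# Only loop over data files whose annotation file was found
for (i in which(pairs$has_annotation)) {
  inFileData <- read.csv(pairs$data_path[i])
  pvalues    <- read.csv(pairs$annotation_path[i])
  # Build the plot here, then save it to pairs$output_path[i]
}
```

Because every path is built with file.path(), the loop never depends on the current working directory, and a data file with no matching annotation file is skipped instead of raising an error.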

Thanks for the help.
