Reading files in a directory with Python

Question

I have a .csv file containing 3000 rows of data in 2 columns, like this:

uc007ayl.1  ENSMUSG00000041439
uc009mkn.1  ENSMUSG00000031708
uc009mkn.1  ENSMUSG00000035491

In another folder, I have graphs with names like these:

uc007csg.1_nt_counts.txt
uc007gjg.1_nt_counts.txt

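Notice that the graph names have the same format as my first column.

I am trying to use Python to identify the rows that have a graph and print the second-column name to a new .txt file.

This is my code: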

import csv
with open("C:/*my dir*/UCSC to Ensembl.csv", "r") as f:
    reader = csv.reader(f, delimiter = ',')
    for row in reader:
        print row[0]

But that is as far as I have got, and I am stuck.

Answer 1

You're almost there:

import csv
import os.path
with open("C:/*my dir*/UCSC to Ensembl.csv", "rb") as f:
    reader = csv.reader(f, delimiter = ',')
    for row in reader:
        graph_filename = os.path.join("C:/folder", row[0] + "_nt_counts.txt")
        if os.path.exists(graph_filename):
            print (row[1])

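Note that repeated calls to os.path.exists may slow the process down, especially if the directory is on a remote filesystem and does not contain significantly more files than the CSV file has rows. You may want to use os.listdir instead: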

import csv
import os

graphs = set(os.listdir("C:/graph folder"))
with open("C:/*my dir*/UCSC to Ensembl.csv", "rb") as f:
    reader = csv.reader(f, delimiter = ',')
    for row in reader:
        if row[0] + "_nt_counts.txt" in graphs:
            print (row[1])
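
The snippets above are written for Python 2 (hence the "rb" mode). As a rough Python 3 sketch of the same os.listdir idea, assuming a comma-delimited file, the function below collects the matching names instead of printing them. The name names_with_graphs and its suffix parameter are made up for illustration and do not come from the answer.

```python
import csv
import os


def names_with_graphs(csv_path, graph_dir, suffix="_nt_counts.txt"):
    """Return second-column values whose first-column ID has a graph file."""
    # List the directory once, then do set membership tests per row.
    graphs = set(os.listdir(graph_dir))
    names = []
    # Python 3: csv files are opened in text mode with newline="" (not "rb").
    with open(csv_path, newline="") as f:
        for row in csv.reader(f, delimiter=","):
            # Skip blank or short rows instead of raising IndexError.
            if len(row) >= 2 and row[0] + suffix in graphs:
                names.append(row[1])
    return names
```

Calling names_with_graphs with the CSV path and the graph folder returns a list that can then be printed or written to a file, one name per line.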

Answer 2

First, check that print row[0] really gives you the correct file identifiers.

Second, join the path to the folder with row[0] (plus the _nt_counts.txt suffix) and use os.path.exists(path) to check whether that full path, and therefore the file, really exists (see http://docs.python.org/library/os.path.html#os.path.exists).

If it exists, you can write row[1] (the second column) to a new file with f2.write("%s\n" % row[1]) (of course, you first have to open f2 for writing).
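
Put together, those steps might look like the following Python 3 sketch. It assumes a comma-delimited CSV and graph files named with the _nt_counts.txt suffix from the question. The function name write_names and its parameters are hypothetical, and only f2 follows the naming used above.

```python
import csv
import os


def write_names(csv_path, graph_dir, out_path):
    """Write row[1] to out_path for each row whose graph file exists."""
    written = 0
    # Open the CSV for reading and f2 for writing in one with statement.
    with open(csv_path, newline="") as f, open(out_path, "w") as f2:
        for row in csv.reader(f, delimiter=","):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            # Build the full path of the graph that belongs to this row.
            path = os.path.join(graph_dir, row[0] + "_nt_counts.txt")
            if os.path.exists(path):
                f2.write("%s\n" % row[1])
                written += 1
    return written
```

The return value is just the number of names written, which makes it easy to check that anything matched at all.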

Answer 3

Well, the next step would be to check whether the file exists. There are a few ways to do that, but I like the EAFP approach.

import os

try:
    # Opening the file doubles as the existence check (EAFP).
    with open(os.path.join(the_dir, row[0] + '_nt_counts.txt')) as f:
        pass
except IOError:
    print('Oops no file')

the_dir is the directory where the files are located.
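
To use the EAFP check inside the CSV loop, it can be wrapped in a small helper. This is only a sketch for Python 3: graph_exists and its suffix parameter are hypothetical names, and the suffix default is taken from the file names in the question.

```python
import os


def graph_exists(the_dir, identifier, suffix="_nt_counts.txt"):
    """EAFP check: try to open the graph file and report whether it worked."""
    try:
        with open(os.path.join(the_dir, identifier + suffix)):
            return True
    except OSError:  # IOError is an alias of OSError in Python 3
        return False
```

One consequence of this design is that a file which exists but cannot be opened (for example, because of permissions) is also reported as missing, which may or may not be what you want.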

Answer 4

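# Write the second column of every row whose graph file can be opened.
with open('result.txt', 'w') as result, \
        open('C:/*my dir*/UCSC to Ensembl.csv', 'r') as rows:
    for line in rows:
        # Strip the trailing newline before splitting into columns.
        line = line.strip().split(',')
        try:
            # Opening the file doubles as the existence check; close it at once.
            open('/path/to/dir/' + line[0] + '_nt_counts.txt', 'r').close()
        except IOError:
            continue
        else:
            result.write(line[1] + '\n')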

Answer 5

import csv
import os

# get prefixes of all graphs in another directory
suff = '_nt_counts.txt'
graphs = set(fn[:-len(suff)] for fn in os.listdir('another dir') if fn.endswith(suff))

with open(r'c:\path to\file.csv', 'rb') as f:
    # extract 2nd column if the 1st one is a known graph prefix
    names = (row[1] for row in csv.reader(f, delimiter='\t') if row[0] in graphs)
    # write one name per line
    with open('output.txt', 'w') as output_file:
        for name in names:
            print >>output_file, name

Answer 1
3, accepted, 2012-07-31 10:25:46

Answer 2
1, 2012-07-31 10:24:18

Answer 3
0, 2012-07-31 10:25:19

Answer 4
0, 2012-07-31 10:28:28

Answer 5
0, 2012-07-31 10:41:17
