How do I get values (lines) from file2 using only the indexes in file1?

Question

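I have a file1 that contains only indexes, and a second file, file2, that contains the values for those indexes. How can I use the indexes in file1 to get the values from file2 and write them to a third file? For simplicity, every index in file1 has an associated value in file2.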

For example:

file1:

2
3
4

file2 contents:

Expected result:

7.50 0.67
0.23 0.78
0.45 0.49

# file1: contains only the indexes
# file2: contains the value associated with each index in file1

fname = file1.readlines()
fname2 = file2.readlines()
outfile = open('Values.txt','w')

for index in fname:
  for line in fname2:
    if line == index:
      outfile.writelines(line)

print("all indices' values have been written to a file successfully")

Answer 1

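These solutions don't depend on file1 being sorted, but they do load file2 into memory, which can be costly if file2 is large. You'll notice, though, that wanted_lines and lines_out in the first example are generators, which should save a little memory.

This example has no error handling, but it is basically what you need. I'll put together a better one in a few seconds.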

# file1, file2 and file3 are path strings
wanted_lines = (int(line) for line in open(file1))
all_lines = [line.rstrip('\n') for line in open(file2)]  # values come from file2
lines_out = (all_lines[index] + '\n' for index in wanted_lines)
open(file3, 'w').writelines(lines_out)

Better:

all_lines = [line.rstrip('\n') for line in open(file2)]
lines_out = []
for line in open(file1):
    try:
        index = int(line)
        lines_out.append(all_lines[index] + '\n')
    except ValueError:
        print("could not coerce", line.strip(), "to an int")
    except IndexError:
        # the index points past the end of file2
        print(file2, "is only", len(all_lines), "lines long, therefore has no", index + 1, "th line.")
open(file3, 'w').writelines(lines_out)
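
The same idea can be written as a function that works on any iterable of lines, which makes it easy to try without real files. This is only a minimal sketch: `select_lines` and its parameter names are made up for illustration, and it assumes 0-based indexes, one per line.

```python
def select_lines(index_lines, data_lines):
    """Return (selected, problems) for 0-based indexes given as text lines."""
    data = [line.rstrip("\n") for line in data_lines]
    selected, problems = [], []
    for raw in index_lines:
        text = raw.strip()
        if not text:
            continue  # ignore blank lines in the index file
        try:
            index = int(text)
        except ValueError:
            problems.append("not an integer: %r" % text)
            continue
        if 0 <= index < len(data):
            selected.append(data[index])
        else:
            problems.append("index %d is out of range for %d lines" % (index, len(data)))
    return selected, problems
```

With real files it could be called as `select_lines(open('file1'), open('file2'))`, and the selected lines then written out with a trailing newline each.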

Answer 2

with open("file1") as f:
    fname = [int(i) for i in f]   # wanted line numbers (0-based)
with open("file2") as f:
    for number, line in enumerate(f):
        if number in fname:
            print(line.rstrip())

Answer 3

def copyLines(infname, outfname, lines, firstLine=0):
    lines = list(set(lines))   # remove duplicates
    lines.sort(reverse=True)   # sort in descending order, so pop() yields the smallest
    with open(infname, 'r') as inf, open(outfname, 'w') as outf:
        try:
            i = firstLine
            while lines:
                seek = lines.pop()
                while i < seek:
                    next(inf)  # skip lines that are not wanted
                    i += 1
                outf.write(next(inf))
                i += 1
        except StopIteration:  # hit end of file
            pass

def main():
    with open('file1') as inf:
        linesToRead = [int(ln) for ln in inf]

    copyLines('file2', 'Values.txt', linesToRead)

if __name__ == "__main__":
    main()

Note that this exits early once all the wanted lines have been found (i.e., if you only want lines 3-9 of a 1000-line file, it will only read up to line 9).
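
The early exit can also be expressed as a small generator over any iterable of lines. This is a sketch under the same assumptions (0-based line numbers, output in file order); `stream_lines` is a hypothetical name, not something from the code above.

```python
def stream_lines(data_lines, wanted):
    """Yield the lines whose 0-based position is in wanted, in file order."""
    remaining = set(wanted)      # duplicates collapse, as in copyLines
    if not remaining:
        return
    last = max(remaining)
    for number, line in enumerate(data_lines):
        if number in remaining:
            yield line
        if number >= last:
            break                # nothing further can match, stop reading
```

Passing an open file as `data_lines` means the rest of the file is never read once the largest wanted line has been reached.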

Answer 4

If that is all you want to do, you can use a bash one-liner:

join file1 <(grep -v '^$' file2 | cat -n ) | cut -d ' ' -f 2- > Values.txt

But here the indexes start at 1 rather than 0. To start from 0:

join <(awk '{print $1+1}' < file1)  <(grep -v '^$' file2 | cat -n) | cut -d ' ' -f 2- > Values.txt

Answer 5

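If the lines of file2 don't have to be written in the order in which they appear in file1, and if the contents of file1 are small enough to fit in RAM, then this should do it efficiently: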

desired = set(int(line) for line in open('file1'))
with open('file2') as infile, open('Values.txt', 'w') as outfile:
    for index, line in enumerate(infile):
        if index in desired:
            outfile.write(line)
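This differs from kurumi's answer mainly in that it uses a set to hold the line numbers from file1 (an O(1) rather than O(n) check for whether to emit a line), and in that it uses file.write, so it doesn't alter the whitespace of the original lines from file2.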
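
If the order from file1 does matter, one possible variation keeps the single pass over file2 and the set lookup, but remembers the matching lines so they can be emitted in file1's order afterwards. This is a sketch only: `lines_in_index_order` is a hypothetical helper, it assumes 0-based indexes, and it holds the selected lines (not all of file2) in memory.

```python
def lines_in_index_order(index_lines, data_lines):
    """Return the selected lines in the order the indexes were listed."""
    order = [int(text) for text in index_lines if text.strip()]
    desired = set(order)          # O(1) membership test per line
    found = {}
    for number, line in enumerate(data_lines):
        if number in desired:
            found[number] = line  # stored unchanged, whitespace included
    # indexes with no matching line are silently skipped
    return [found[i] for i in order if i in found]
```

The result can be passed straight to `outfile.writelines(...)`, since each line still carries its original newline.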

Answer 6

This version doesn't assume that the indexes in file1 are sorted.

indices = [int(x) for x in open("file1")]
data = open("file2").readlines()

for i in indices:
    print(data[i].rstrip('\n'))   # the line already ends with a newline
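
A similar lookup by position is available through the standard library's `linecache` module, which also reads the whole file into memory on first use. The sketch below assumes 0-based indexes in file1; `get_lines` is a hypothetical name.

```python
import linecache

def get_lines(path, indexes):
    """Return the lines of path at the given 0-based indexes, in the order given."""
    # linecache counts lines from 1 and returns '' for a line that does not exist
    return [linecache.getline(path, i + 1) for i in indexes]
```

For example, `get_lines("file2", indices)` would return the same lines that the loop above prints, each with its trailing newline.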

6 answers

Answer 1
1 2011-03-22 15:01:40

Answer 2
0 2011-03-22 15:02:19

Answer 3
0 2011-03-22 15:02:45

Answer 4
0 2011-03-22 15:25:26

Answer 5
0 2011-03-22 16:00:34

Answer 6
0 2011-03-22 17:40:05
