Using grep in Python to export multiple output files

Question

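I'm writing a Python script that has to use grep, and I'm having trouble with the grep step. I start with "Infile", then cut and sort that file to create "Infile.ids", which contains the unique IDs found in "Infile". I then have to run each ID from "Infile.ids", line by line, against "Infile" and extract every line containing that ID into a new, separate file. The problem is that when I run it through grep, it seems to run all the lines at once and basically gives me a bunch of files identical to the original "Infile" instead of separate, unique files.

Here are an example "Infile" and the output files I'm trying to get.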

Infile              Infile.ids    Infile.Hello     Infile.World      Infile.Adios
Hello 1 3 5 7       Hello         Hello 1 3 5 7    World 2 4 6 8     Adios 1 2 3 4
World 2 4 6 8       World         Hello a b c d    World e f g h     Adios i j k l
Adios 1 2 3 4       Adios
Hello a b c d
World e f g h
Adios i j k l

This is the code I have so far:

#!/usr/bin/python

import sys
import os

Infile = sys.argv[1]

os.system("cut -d \" \" -f1 %s | sort -u > %s.ids" % (Infile, Infile))
Infile2 = "%s.ids" % Infile

handle = open("%s.ids" % Infile, "r")
line = handle.readline()

for line in handle:
    os.system("grep \"%s\" %s > %s.%s" % (line, Infile, Infile, line))
    line = handle.readline()

handle.close()

Answer 1

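When you iterate over handle, each line will have a trailing newline, which the corresponding lines in Infile clearly don't have at that position (there the ID is followed by the "1 3 5 7" data first). That is why your grep is failing.

Try doing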
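To make the trailing newline visible, here is a minimal sketch that uses an in-memory stand-in for Infile.ids (the `io.StringIO` object and the variable names are assumptions for illustration, not part of the original script). It also builds the shell command string the way the question's loop does. With many grep implementations a newline inside the pattern separates alternative patterns, so the stray newline would likely add an empty pattern that matches every line. That would be consistent with the output files being copies of Infile, but it is an interpretation rather than something the answer states.

```python
import io

# Hypothetical stand-in for the open Infile.ids handle, using the IDs from the question.
handle = io.StringIO("Adios\nHello\nWorld\n")

# Iterating over a file-like object keeps the newline at the end of each line.
raw = list(handle)
stripped = [line.strip() for line in raw]

# The command string built from an unstripped line, as in the question's loop.
command = 'grep "%s" %s > %s.%s' % (raw[1], "Infile", "Infile", raw[1])

print(raw)            # repr-style output shows the '\n' on every entry
print(stripped)       # the same IDs with the newline removed
print(repr(command))  # the newline ends up inside the quoted grep pattern
```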

for line in handle.readlines():
    line = line.strip()
    os.system("grep \"%s\" %s > %s.%s" % (line, Infile, Infile, line))

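and remove the two line = handle.readline() statements: a for loop iterates over the lines by itself. If you want to use explicit read calls, a while loop would be more appropriate (though I doubt it's advisable in this case).

Cheers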
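For comparison, the same fix can be sketched with `subprocess.run` instead of `os.system`. This is only one possible variant, assuming Python 3.5+ and a `grep` binary on the PATH; the function name `split_by_id` and its parameters are hypothetical. Passing the arguments as a list sidesteps shell quoting, and grep's output is redirected by handing it an open file.

```python
import subprocess


def split_by_id(infile, ids_file):
    """For each ID in ids_file, write the matching lines of infile to <infile>.<ID>."""
    written = []
    with open(ids_file) as handle:
        for line in handle:
            id_ = line.strip()  # drop the trailing newline before using the ID
            if not id_:
                continue  # skip blank lines
            out_name = "%s.%s" % (infile, id_)
            with open(out_name, "w") as out:
                # Like the original command, this matches the ID anywhere in a line.
                subprocess.run(["grep", "-e", id_, infile], stdout=out)
            written.append(out_name)
    return written
```

Calling `split_by_id("Infile", "Infile.ids")` after the cut/sort step should then produce one file per ID, such as Infile.Hello and Infile.World.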

Answer 1: 0, accepted, 2013-04-08 10:49:06
