python: pull lines from text file when first column matches string from list
I have a list ['dog', 'cat', 'snake', 'lizard']. I want to use this list to extract lines from a text file. My text file is tab delimited, with a newline character at the end of each line. Each line has 4 columns, the first being one of the names from my list. The first five lines would look like:
dog data1 data2 data3
dog data1 data2 data3
cat data1 data2 data3
snake data1 data2 data3
lizard data1 data2 data3
and so on for many lines.
I want to make a text file for each of the items in my list. In each new file I want every line from the original file whose first column matches the name in the list (which is also the name of the new file). This is the code I have written:
filename = "data.txt"
f = open(filename, 'r')
#my list is named Species
for names in Species:
    with open(str(names) + ".txt", 'w') as g:
        for line in f:
            row = line.split()
            if names == row[0]:
                g.write(row)
I am able to create the text files I wish to write to, but nothing is being written to them. I am getting no error messages. In the end, I would like to be able to extract only some of the columns of data from each line that I am interested in putting into my new text file.
You should be getting an error from trying to write a list directly to a file (not legal in Python):
Python 2.7:
Python 2.7.10 (default, Sep 13 2015, 20:30:50)
[GCC 5.2.1 20150911] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> with open("test", "w") as f:
...     f.write([1,2,3,4])
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: expected a character buffer object
>>>
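For comparison, here is a minimal sketch of the same check under Python 3, where the call also fails with a TypeError (the message wording differs from 2.7). The scratch file path and the sample row are assumptions made for illustration. Joining the fields back into a single string before writing avoids the error.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

row = ["dog", "data1", "data2", "data3"]
error = None
with open(path, "w") as f:
    try:
        f.write(row)  # a list is not a str, so this raises TypeError
    except TypeError as exc:
        error = exc
    # Join the fields into one tab-delimited line before writing.
    f.write("\t".join(row) + "\n")

with open(path) as f:
    content = f.read()
```

Only the joined string ends up in the file; the failed write leaves nothing behind.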
The write isn't being called, probably because there isn't a line that matches Species[0]. When the top-level for loop runs again for Species[1], f is already at end-of-file and won't give any more lines. Call f.seek(0) to return to the beginning of the file at the start of each loop iteration:
for name in Species:
    f.seek(0)  # rewind the input so every species scans the whole file
    with open(str(name) + ".txt", "w") as g:
        for line in f:
            if line.startswith(name):
                g.write(line)
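The end-of-file behaviour described above can be seen in isolation with a small sketch. Here io.StringIO stands in for the opened data file (an assumption, so the example needs no file on disk); a real file object iterates and seeks the same way.

```python
import io

# io.StringIO stands in for the opened data file so the sketch is self-contained.
f = io.StringIO("dog\ta\tb\tc\ncat\ta\tb\tc\n")

first_pass = [line for line in f]   # consumes the stream
second_pass = [line for line in f]  # nothing left: position is at end-of-file

f.seek(0)                           # rewind to the start
third_pass = [line for line in f]   # all lines are available again
```

The second pass yields no lines at all, which is why every species after the first got an empty file.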
Alternatively (this is what I'd do) you can scan through f once, and assign each line to the proper animal as you process it:
records = {}
for line in f:
    animal = line.split()[0]
    if not records.get(animal):
        records[animal] = []
    records[animal].append(line)
for animal in records.keys():
    with open("{}.txt".format(animal), "w") as f:
        for line in records[animal]:
            f.write(line)
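The single-pass idea can also cover the question's last request, keeping only some columns. The following is a sketch under stated assumptions: the function name split_by_first_column, its parameters, and the sample data are hypothetical, and every matching row is assumed to have enough columns for the requested indices.

```python
import os
import tempfile
from collections import defaultdict

def split_by_first_column(src_path, out_dir, wanted, keep_columns):
    """Group lines by first column in one pass, keeping only selected columns.

    Hypothetical helper: `wanted` is a set of names to keep and
    `keep_columns` is a list of zero-based column indices.
    """
    records = defaultdict(list)
    with open(src_path) as src:
        for line in src:
            row = line.rstrip("\n").split("\t")
            if row[0] in wanted:
                kept = "\t".join(row[i] for i in keep_columns)
                records[row[0]].append(kept + "\n")
    # One output file per name that actually appeared in the input.
    for name, lines in records.items():
        with open(os.path.join(out_dir, name + ".txt"), "w") as out:
            out.writelines(lines)
    return records

# Demo with made-up data in a temporary directory.
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "data.txt")
with open(src_path, "w") as f:
    f.write("dog\td1\td2\td3\ndog\td4\td5\td6\ncat\td7\td8\td9\n")

species = ["dog", "cat", "snake", "lizard"]
result = split_by_first_column(src_path, work_dir, set(species), keep_columns=[0, 2])
with open(os.path.join(work_dir, "dog.txt")) as f:
    dog_text = f.read()
```

Unlike the loop in the question, this version creates no file for a name that never appears in the input, and it holds all matching lines in memory until the end.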
Here's the updated code!
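Species = ['dog', 'cat', 'snake', 'lizard']
filename = "data.txt"
for names in Species:
    with open(str(names) + ".txt", 'w') as g:
        # Reopen the input for each species so every pass starts at the first line.
        f = open(filename, 'r')
        for line in f:
            row = line.split()
            if names == row[0]:
                # Write the original line; str(row) would write the list's repr with no newline.
                g.write(line)
        f.close()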
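The updated code reads data.txt once per species. If the input is large, one possible alternative is to open every output file up front and stream the input a single time. This is a sketch, not the poster's method: the function name stream_split and the sample data are assumptions, and contextlib.ExitStack is used only to close all output files reliably.

```python
import contextlib
import os
import tempfile

def stream_split(src_path, out_dir, species):
    """Write each input line to the file named after its first column.

    Hypothetical helper: reads the input once and keeps one output
    handle open per species, so nothing is buffered in memory.
    """
    with contextlib.ExitStack() as stack:
        outputs = {
            name: stack.enter_context(
                open(os.path.join(out_dir, name + ".txt"), "w")
            )
            for name in species
        }
        with open(src_path) as src:
            for line in src:
                first = line.split("\t", 1)[0]
                out = outputs.get(first)
                if out is not None:
                    out.write(line)

# Demo with made-up data in a temporary directory.
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "data.txt")
with open(src_path, "w") as f:
    f.write("dog\td1\td2\td3\ncat\td7\td8\td9\ndog\td4\td5\td6\n")

stream_split(src_path, work_dir, ["dog", "cat", "snake", "lizard"])
with open(os.path.join(work_dir, "dog.txt")) as f:
    dog_text = f.read()
with open(os.path.join(work_dir, "snake.txt")) as f:
    snake_text = f.read()
```

As in the original loop, a species with no matching lines still gets an (empty) file, because all outputs are created before the input is read.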