How to read specific lines from a text file in Python?
I have a text file that contains a lot of data. I want to read it and write a new text file, but the new file should leave out some parts of the original.
For example, the text file has:
------------------------
Age: 39
Gender: Female
Smoking: Yes
remarks: something about the person
-----------------------
Age: 52
Gender: Male
Smoking: Yes
remarks: something about the person
-----------------------
How do I get the new file to contain only the age and gender lines, so that it looks like this (including the dashes that divide each entry)?
-----------------------
Age: 39
Gender: Female
-----------------------
Age: 52
Gender: Male
-----------------------
I've seen a few code samples and other questions, but none of them simply remove specific lines.
with open('path/to/infile') as infile, open('path/to/outfile', 'w') as outfile:
    for line in infile:
        # Keep only the Age, Gender and dash separator lines
        if line.startswith(("Age", "Gender", "----")):
            outfile.write(line)
Alternatively, with grep:
grep -ioP '^-.*$|^Age:.*$|^Gender:.*$' path/to/infile.txt > path/to/outfile.txt
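To try the line-filtering idea without touching real files, here is a minimal sketch that wraps it in a function and runs it on an in-memory copy of the sample data. The function name `filter_lines` and its `prefixes` parameter are illustrative and not part of the answer above.

```python
import io

def filter_lines(infile, outfile, prefixes=("Age", "Gender", "----")):
    """Copy only the lines that start with one of the given prefixes."""
    for line in infile:
        # str.startswith accepts a tuple and is True if any prefix matches
        if line.startswith(tuple(prefixes)):
            outfile.write(line)

# In-memory stand-in for the input file from the question
sample = (
    "-----------------------\n"
    "Age: 39\n"
    "Gender: Female\n"
    "Smoking: Yes\n"
    "remarks: something about the person\n"
    "-----------------------\n"
    "Age: 52\n"
    "Gender: Male\n"
    "Smoking: Yes\n"
    "remarks: something about the person\n"
    "-----------------------\n"
)

out = io.StringIO()
filter_lines(io.StringIO(sample), out)
result = out.getvalue()
print(result, end="")
```

Because the function only iterates over `infile` and calls `outfile.write`, it should work the same way with the file objects returned by `open`.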
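import re

# Read the whole file as text so the str pattern can match it
with open('filename.txt') as f:
    text = f.read()

# Each match is an (age, gender) tuple
matches = re.findall(r'Age: (\d+)\nGender: (Male|Female)', text)
print("-----------------------")
for age, gender in matches:
    print('Age: ' + age + '\nGender: ' + gender)
    print("-----------------------")

You can be even lazier and grab the dashes in the regex too:

# (?:.*\n){3} skips the rest of the Gender line plus the Smoking and remarks lines
matches = re.findall(r'Age: (\d+)\nGender: (Male|Female)(?:.*\n){3}(\-*)', text)
for age, gender, dashes in matches:
    print("Age: " + age + "\nGender: " + gender + "\n" + dashes)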
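The regex approach can also be made line-oriented, so that it does not depend on how many lines sit between the Gender line and the next row of dashes. The following is a rough sketch under the assumption that every wanted line starts with `Age:`, `Gender:` or a dash; `KEEP` and `keep_fields` are hypothetical names introduced here.

```python
import re

# With re.MULTILINE, ^ and $ match at the start and end of every line
KEEP = re.compile(r'^(?:-+|Age:.*|Gender:.*)$', re.MULTILINE)

def keep_fields(text):
    """Return the dash, Age and Gender lines as one newline-terminated string."""
    lines = KEEP.findall(text)
    return "\n".join(lines) + "\n" if lines else ""

# In-memory stand-in for the input file from the question
sample = (
    "-----------------------\n"
    "Age: 39\n"
    "Gender: Female\n"
    "Smoking: Yes\n"
    "remarks: something about the person\n"
    "-----------------------\n"
    "Age: 52\n"
    "Gender: Male\n"
    "Smoking: Yes\n"
    "remarks: something about the person\n"
    "-----------------------\n"
)

result = keep_fields(sample)
print(result, end="")
```

The returned string can be passed to the `write` method of a file opened in `'w'` mode to produce the new text file.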