How to copy a segment within a line of a text file into a list
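How can I use Python to loop over each line of a text file and copy the author's name into a list? The text file I'm working with contains the following quotes, each ending with the author's name: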
Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton
No man means all he says, and yet very few say all they mean, for words are slippery and thought is viscous. --- Henry B. Adams
One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams
Try this:
authors_list = []
with open('file.txt', 'r') as f:
    for line in f:
        # Split "quote --- author" at the separator
        text = line.rstrip('\n').split(" --- ")
        # Skip lines that have no separator
        if len(text) > 1:
            authors_list.append(text[1])
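Using a regular expression, you can do the following:
import re
with open('text.txt') as f:
    txt = f.read()
# Capture everything after '---' up to the end of each line
authors = re.findall(r'(?<=---).*', txt)
# Strip surrounding whitespace from each match
authors = [a.strip() for a in authors]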
Here's a somewhat fun generator-based solution:
# Generate stream manipulators
def strip(stream):
    """Strips whitespace from stream entries"""
    for entry in stream:
        yield entry.strip()
def index(i, stream):
    """Takes the i-th element from the stream entries"""
    for entry in stream:
        yield entry[i]
def split(token, stream):
    """Splits the entries in the stream based on the token"""
    for entry in stream:
        yield entry.split(token)
# Actual function to do the work
def authors(filename):
    """Yields the authors from the file format"""
    with open(filename) as f:
        for entry in strip(index(1, split('---', f))):
            yield entry
print(list(authors('file.txt')))
Generator/filter/pipeline-based solutions work well for this kind of task.
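As a more compact take on the same pipeline idea, the stages can also be written as generator expressions. This is only a sketch: the function name `iter_authors`, the `sep` parameter, and the choice to silently skip lines without a separator are assumptions for illustration, not part of the answer above.

```python
def iter_authors(lines, sep="---"):
    """Lazily yield the author from each 'quote --- author' line.

    `iter_authors` and `sep` are hypothetical names for this sketch.
    """
    # Stage 1: split each line once at the first separator.
    parts = (line.split(sep, 1) for line in lines)
    # Stage 2: keep only lines that had a separator, then strip whitespace.
    return (p[1].strip() for p in parts if len(p) == 2)


sample = [
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n",
    "a line without an author\n",
    "One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n",
]
print(list(iter_authors(sample)))  # ['Lord Acton', 'Henry B. Adams']
```

Because the function accepts any iterable of lines, it can be handed an open file object directly, and nothing is read until the result is consumed.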
The snippet below should also work. readlines() reads the entire file and loads it into memory, so use it with caution on large files. For smaller ones it should be fine.
n = []
with open('test1.txt') as fd:
    lines = fd.readlines()
for line in lines:
    # Take the part after '---' and strip whitespace and the newline
    n.append(line.split('---')[1].strip())
print(n)
Output: ['Lord Acton', 'Henry B. Adams', 'Henry B. Adams']
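If memory is a concern with readlines(), one option is to iterate over the file object itself, which yields one line at a time. The following is a minimal sketch under a few assumptions: the helper name `read_authors`, the `sep` parameter, and the UTF-8 encoding are illustrative choices, not something stated in the thread.

```python
def read_authors(path, sep="---"):
    """Collect authors line by line instead of loading the whole file.

    `read_authors` is a hypothetical helper; UTF-8 is an assumed encoding.
    """
    authors = []
    with open(path, encoding="utf-8") as fd:
        for line in fd:  # the file object yields lines lazily
            # rpartition splits at the LAST separator, so a quote that
            # itself contains '---' still gives the trailing author.
            _, found, author = line.rpartition(sep)
            if found:  # empty string when the line has no separator
                authors.append(author.strip())
    return authors
```

Calling `read_authors('test1.txt')` would return the same list as the snippet above for the sample file, and lines without a separator are skipped rather than raising an IndexError.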