How to copy a segment within a line of a text file into a list

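How can I use Python to loop through each line of a text file and copy the author's name into a list? The text file I'm working with contains the following quotes, each followed by the author's name at the end of the line: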

Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton
No man means all he says, and yet very few say all they mean, for words are slippery and thought is viscous. --- Henry B. Adams
One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams

Try this:

authors_list = []
with open('file.txt', 'r') as f:
    for line in f:
        text = line.rstrip('\n').split(" --- ")
        if len(text) > 1:
            authors_list.append(text[1])
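
If a quote could itself contain the separator, splitting from the right may be safer. Below is a minimal sketch, assuming the author always follows the last " --- " on the line. The helper name `extract_author` and the sample lines are hypothetical, used only for illustration:

```python
def extract_author(line, sep=" --- "):
    """Return the text after the last separator, or None if it is absent."""
    # rpartition splits on the LAST occurrence of sep
    _, found, author = line.rstrip("\n").rpartition(sep)
    return author.strip() if found else None


lines = [
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n",
    "A line without an author\n",
]
# Keep only the lines where a separator was actually found
authors_list = [a for a in map(extract_author, lines) if a is not None]
print(authors_list)
```

Returning `None` for lines without a separator plays the same role as the `len(text) > 1` check above.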

Using a regular expression, you could do the following:

import re

with open('text.txt') as f:
    txt = f.read()

# Capture everything after '---' up to the end of each line,
# then trim the surrounding whitespace
authors = [a.strip() for a in re.findall(r'(?<=---).*', txt)]

Here is a generator-based solution, just for fun:

# Generator stream manipulators
def strip(stream):
    """Strips whitespace from stream entries"""

    for entry in stream:
        yield entry.strip()

def index(i, stream):
    """Takes the i-th element from the stream entries"""

    for entry in stream:
        yield entry[i]

def split(token, stream):
    """Splits the entries in the stream based on the token"""

    for entry in stream:
        yield entry.split(token)

# Actual function to do the work
def authors(filename):
    """Yields the authors from the file format"""

    # Open the file in a with block so it is closed when the generator finishes
    with open(filename) as f:
        for entry in strip(index(1, split('---', f))):
            yield entry

print(list(authors('file.txt')))

Generator/filter/pipeline-based solutions work well for this kind of task.
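
The same pipeline idea can also be written with generator expressions instead of named generator functions. This is only a sketch: the function name `iter_authors` and the in-memory sample are illustrative assumptions, and lines without the token are skipped rather than raising an error:

```python
import io


def iter_authors(lines, token="---"):
    """Lazily yield the author from each line that contains the token."""
    parts = (line.split(token) for line in lines)   # split stage
    tails = (p[1] for p in parts if len(p) > 1)     # index stage, skipping lines without the token
    return (t.strip() for t in tails)               # strip stage


# In-memory stand-in for an open file, so the snippet runs on its own
sample = io.StringIO(
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n"
    "One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n"
)
print(list(iter_authors(sample)))
```

Each stage consumes the previous one lazily, so nothing is read until the result is iterated.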

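The following should also work. readlines() reads the entire file and loads it into memory, so use it with care when you have large files. For smaller ones, it should be fine.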

n = []
with open('test1.txt') as fd:
    lines = fd.readlines()
    for line in lines:
        # Take the part after '---' and trim surrounding whitespace
        n.append(line.split('---')[1].strip())

print(n)

Output: ['Lord Acton', 'Henry B. Adams', 'Henry B. Adams']
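
For large files, the memory concern about readlines() can be avoided by iterating over the file object directly, which yields one line at a time. A minimal sketch under that assumption follows. The function name `read_authors` and the temporary demo file are hypothetical, not part of the original answer:

```python
import os
import tempfile


def read_authors(path):
    """Collect authors while reading the file one line at a time."""
    names = []
    with open(path) as fd:
        for line in fd:  # the file object yields lines lazily; no readlines() needed
            _, sep, author = line.partition('---')
            if sep:  # skip lines that have no '---'
                names.append(author.strip())
    return names


# Demo with a throwaway file so the snippet runs on its own
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n")
    tmp.write("One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n")

result = read_authors(tmp.name)
os.remove(tmp.name)
print(result)
```

The list of names still grows with the number of authors, but the file contents themselves are never held in memory all at once.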

