
Custom generator for file filter

I'm writing a small wrapper class around open that filters out particular lines from a text file and splits the remaining lines into name/value pairs before passing them back to the user. Naturally, this process lends itself to being implemented using generators.

My "file" class:

class special_file:
    def __init__(self, fname):
        self.fname = fname

    def __iter__(self):
        return self

    def __next__(self):
        return self.next()

    def next(self):
        with open(self.fname, 'r') as file:
            for line in file:
                line = line.strip()
                if line == '':
                    continue
                name,value = line.split()[0:2]
                if '%' in name:
                    continue
                yield name,value
            raise StopIteration()

Userland code:

for g in special_file('input.txt'):
    for n,v in g:
        print(n,v)

My code, sadly, has two enormous problems: 1) special_file returns a generator when it really needs to return a tuple, and 2) the StopIteration() exception is never raised, so the file is read repeatedly ad infinitum. I have a sneaking suspicion that these two issues are related, but my understanding of generators and iterable sequences is fairly limited. Have I missed something painfully obvious about implementing a generator?

Edit:

I fixed my infinite reading problem by moving the first generator outside of the loop and then just looping over it.

g = special_file('input.txt')
k = next(g)
for n,v in k:
    print(n,v)

However, I would like the user to be able to use it like a normal call to open:

for n,v in special_file('input.txt'):
    print(n,v)

You've implemented an iterator whose next method is itself a generator function, so every call to it returns a new generator instead of a single pair. Just write the generator directly.

def special_file(filename):
    with open(filename, 'r') as file:
        for line in file:
            line = line.strip()
            if line == '':
                continue
            name, value, *_ = line.split()
            if '%' in name:
                continue
            yield name, value
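
As a quick check of the generator above, the sketch below runs the same filtering logic against a temporary file. The sample contents and the helper name `parse_pairs` are assumptions made for illustration and do not come from the original post. Splitting the parsing from the file handling is just one possible arrangement; it makes the filter easy to try on any iterable of lines.

```python
import os
import tempfile


def parse_pairs(lines):
    """Yield (name, value) from an iterable of lines, skipping blanks and '%' names."""
    for line in lines:
        line = line.strip()
        if line == '':
            continue
        name, value, *_ = line.split()
        if '%' in name:
            continue
        yield name, value


def special_file(filename):
    # The with-block stays open while the generator is being consumed
    # and closes the file once iteration finishes.
    with open(filename, 'r') as file:
        yield from parse_pairs(file)


# Hypothetical sample input: a blank line, a '%' line, and an extra trailing field.
sample = "alpha 1\n\n%skip 2\nbeta 3 ignored\n"
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)

pairs = list(special_file(tmp.name))
os.remove(tmp.name)

for n, v in pairs:
    print(n, v)
# prints:
# alpha 1
# beta 3
```

Note that a non-blank line with fewer than two fields would still raise ValueError here, just as in the code above.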

See here for an overview of what it means to be iterable, what an iterator is, and Python's protocols for using them.
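
To see why the class in the question never stops, it helps to look at what its `__next__` actually returns. The sketch below uses a hypothetical stand-in class, `Repeater` (not from the question), with the same shape: `__iter__` returns `self`, and `__next__` calls a method that contains `yield`.

```python
class Repeater:
    """Hypothetical stand-in: __next__ delegates to a generator function."""

    def __iter__(self):
        return self

    def __next__(self):
        # Calling a generator function runs none of its body;
        # it only builds and returns a new generator object.
        return self.items()

    def items(self):
        yield "a", "1"
        yield "b", "2"


r = Repeater()
first = next(r)
second = next(r)

print(type(first).__name__)  # generator
print(first is second)       # False
print(list(first))           # [('a', '1'), ('b', '2')]
```

Each call to `next(r)` hands back a fresh generator rather than a pair, which is the first problem described in the question. And because `__next__` never raises `StopIteration` itself, a `for` loop over `r` has no way to end; every new generator simply starts again from the top.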

Just change

def __iter__(self):
    return self

to

def __iter__(self):
    return next(self)

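and it works as expected! Here next(self) calls __next__, which returns the generator created by self.next(), so __iter__ now hands the for loop a generator that yields the pairs.

Thanks to @Leva7 for the suggestion.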
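
If a class is still wanted (for example, to hold extra state), one variant is to make `__iter__` itself a generator method, so that no `__next__` is needed at all. This is a sketch of an alternative, not the change described above; the class name `SpecialFile` and the sample data are assumptions. It also drops the explicit `raise StopIteration()`: since Python 3.7 (PEP 479), a `StopIteration` raised inside a generator body is converted to `RuntimeError`, and a generator finishes cleanly just by returning.

```python
import os
import tempfile


class SpecialFile:
    """Iterable wrapper: every call to __iter__ starts a fresh pass over the file."""

    def __init__(self, fname):
        self.fname = fname

    def __iter__(self):
        # A method containing `yield` returns a generator when called,
        # and a generator is already a valid iterator.
        with open(self.fname, 'r') as file:
            for line in file:
                line = line.strip()
                if line == '':
                    continue
                name, value = line.split()[0:2]
                if '%' in name:
                    continue
                yield name, value
        # No explicit StopIteration: falling off the end finishes the generator.


# Hypothetical sample input for demonstration only.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("width 10\n%comment 0\n\nheight 20\n")

result = [(n, v) for n, v in SpecialFile(tmp.name)]
os.remove(tmp.name)
print(result)  # [('width', '10'), ('height', '20')]
```

With this arrangement the user-facing loop is simply `for n, v in SpecialFile('input.txt'):`, which matches the usage asked for in the question.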
