How to preprocess a text stream on the fly in Python?
What I need is a Python 3 function (or something similar) that takes a text stream (such as sys.stdin or the one returned by open(file_name, "rt")) and returns a text stream to be consumed by some other function, but with all spaces removed, all tabs replaced with commas and all letters converted to lowercase. This should happen on the fly (the "lazy" way), as the data is read by the consumer code.
I assume there is a reasonably easy way to do this in Python 3, something similar to list comprehensions, but so far I don't know what exactly it might be.
I am not sure this is what you mean, but the easiest way I can think of is to inherit from the type returned by open and override the read method to do all the things you want after reading the data. In Python 3 there is no built-in file type any more; for text mode the returned type is io.TextIOWrapper, which wraps a binary stream. A simple implementation would be:
import io

class MyFile(io.TextIOWrapper):
    def read(self, *args, **kwargs):
        data = super().read(*args, **kwargs)
        # Remove spaces, turn tabs into commas, lowercase everything.
        return data.replace(' ', '').replace('\t', ',').lower()
It would be constructed around a binary file object, for example MyFile(open(file_name, 'rb')).
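One caveat with overriding only read(): consumers that iterate over the stream line by line go through readline() instead. A minimal sketch of the same idea as a wrapper around any existing text stream is shown here. The names CleanedText, _clean and _pull are hypothetical, and the class is read-only; it relies on io.TextIOBase providing iteration in terms of readline().

```python
import io


def _clean(text):
    # The three transformations from the question.
    return text.replace(" ", "").replace("\t", ",").lower()


class CleanedText(io.TextIOBase):
    """Hypothetical read-only wrapper around any text stream."""

    def __init__(self, source):
        self._source = source

    def readable(self):
        return True

    def _pull(self, reader, size):
        # Keep reading until something survives cleaning or the source
        # is exhausted, so an all-spaces chunk is not mistaken for EOF.
        while True:
            chunk = reader(size)
            cleaned = _clean(chunk)
            if cleaned or not chunk:
                return cleaned

    def read(self, size=-1):
        return self._pull(self._source.read, size)

    def readline(self, size=-1):
        return self._pull(self._source.readline, size)


source = io.StringIO("Hello World\tFOO\n  A\tb \n")
for line in CleanedText(source):
    print(line, end="")
```

Note that in this sketch size limits the number of characters read from the underlying stream, so the returned string can be shorter than size once spaces have been removed.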
I believe what you are looking for is the io module, more specifically io.StringIO. You can then use the open() function to get the initial data, modify it, and pass the result around:
import io

with open(file_name, 'rt') as f:
    stream = io.StringIO(f.read().replace(' ', '').replace('\t', ',').lower())
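Note that f.read() loads the whole file into memory before any processing happens, so this variant is not lazy. If the consumer only needs to iterate over lines (as csv.reader does, for example), a generator is probably the closest thing to the "list comprehension" style the question asks about. The following is a minimal sketch under that assumption; cleaned_lines is a hypothetical name, and the result is an iterator of lines, not a full file object with read().

```python
import csv
import io


def cleaned_lines(stream):
    """Lazily yield each line with spaces removed, tabs turned into
    commas and letters lowercased. Nothing is read until the consumer
    asks for the next line."""
    for line in stream:
        yield line.replace(" ", "").replace("\t", ",").lower()


# Usage with an in-memory stream standing in for sys.stdin or open(...).
source = io.StringIO("Name\tAge\nAda L\t36\n")
rows = list(csv.reader(cleaned_lines(source)))
print(rows)  # [['name', 'age'], ['adal', '36']]
```

The equivalent generator expression, (line.replace(" ", "").replace("\t", ",").lower() for line in stream), behaves the same way for one-off use.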