Filtering values from generator object
I have some data of type generator.
type(head)
----------
generator
Its values look like this:
for x in head:
    print(x)
Record {
field1 '2022060611121280041700000070046713963'
field2 '2022-06-06 01:11:29'
field3 'NIL'
}
I'm wondering whether it's possible to convert this to a data frame. I could probably write a script that loops over the contents of each Record, but I'm hoping there's a much cleaner way.
As long as the generated contents fit into memory, pandas can consume them:
from pandas import DataFrame
# head is a generator
df = DataFrame([x for x in head])
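The snippet above relies on each item yielded by head being something DataFrame can interpret as a row, such as a dict. The question does not say which library produces the Record objects, so the following is only a minimal sketch: make_records is a hypothetical stand-in for head, and its second record is made-up sample data. It shows the generator being materialized into a data frame and then filtered by column value.

```python
import pandas as pd


def make_records():
    """Hypothetical stand-in for `head`: yields one dict per record."""
    yield {
        "field1": "2022060611121280041700000070046713963",
        "field2": "2022-06-06 01:11:29",
        "field3": "NIL",
    }
    # Made-up second record, only here so the filter has something to drop.
    yield {
        "field1": "0000000000000000000000000000000000001",
        "field2": "2022-06-06 01:12:05",
        "field3": "OK",
    }


head = make_records()

# Materialize the generator; a generator can only be consumed once.
df = pd.DataFrame(list(head))

# Filter after loading: keep only the rows whose field3 equals 'NIL'.
nil_rows = df[df["field3"] == "NIL"]
```

If the real Record objects are not dict-like, they would first need to be converted to dicts; how to do that depends on the library that defines them.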
If the contents of the generator are too large to fit into memory, you can iterate over chunks of data (using toolz) and store each chunk, e.g. to CSV:
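from pandas import DataFrame
from toolz import partition_all

n_elements = 100  # number of records per chunk

for n, chunk in enumerate(partition_all(n_elements, head)):
    df = DataFrame(list(chunk))  # partition_all yields tuples
    if n == 0:
        # first chunk: create the file and write the header row
        df.to_csv('test.csv', index=False, mode='w')
    else:
        # later chunks: append without repeating the header
        df.to_csv('test.csv', index=False, mode='a', header=False)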
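If installing toolz is not an option, the same chunking can be approximated with the standard library. This is a sketch rather than a drop-in replacement: chunks is a hypothetical helper name, and the head generator here is a small stand-in used only to show the chunk sizes.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical stand-in for `head`: a generator of 7 small dicts.
head = ({"field1": str(i)} for i in range(7))

# Only one chunk is held in memory at a time.
sizes = [len(chunk) for chunk in chunks(head, 3)]
```

Each chunk is a plain list, so it can be passed to DataFrame inside the loop above in the same way as the output of partition_all.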