Filtering values from generator object

I have data whose type is generator.

type(head)
----------
generator

Its values look like this:

for x in head:
    print(x)

  Record {
  field1         '2022060611121280041700000070046713963'
  field2         '2022-06-06 01:11:29'
  field3         'NIL'
  }

Is it possible to convert this to a DataFrame? I could probably write a script that loops over the contents of each Record, but I'm hoping there is a much cleaner way.

As long as the generated contents fit into memory, pandas can consume the generator:

from pandas import DataFrame

# head is a generator; list() materializes all of its items in memory
df = DataFrame(list(head))
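
The question does not say which library produces Record, so it is unclear whether DataFrame can read its fields directly. A minimal sketch, assuming each record exposes its fields as attributes: the `Record` dataclass and the `make_head` generator below are hypothetical stand-ins for the real objects, and the second row is made-up sample data. Converting each record to a plain dict first gives pandas an input it is known to accept.

```python
from dataclasses import dataclass, asdict


@dataclass
class Record:
    # Hypothetical stand-in for the real Record type from the question.
    field1: str
    field2: str
    field3: str


def make_head():
    # Hypothetical generator playing the role of `head`.
    yield Record('2022060611121280041700000070046713963',
                 '2022-06-06 01:11:29', 'NIL')
    # Made-up second record, for illustration only.
    yield Record('0000000000000000000000000000000000001',
                 '2022-06-06 01:11:30', 'NIL')


# One dict per record; the keys become the column names.
rows = [asdict(record) for record in make_head()]
```

Passing `rows` to `DataFrame(rows)` then yields one column per field, since pandas accepts a list of dicts. If the real Record is not a dataclass, `asdict` would need to be replaced by whatever field accessor that type provides.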

If the contents of the generator are too large, you can iterate over chunks of data (using toolz) and store each chunk, e.g. to CSV:

from pandas import DataFrame
from toolz import partition_all

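n_elements = 100  # number of records per chunk

for n, x in enumerate(partition_all(n_elements, head)):
    df = DataFrame(x)
    if n == 0:
        # first chunk: create the file and write the header
        df.to_csv('test.csv', index=False, mode='w')
    else:
        # later chunks: append rows without repeating the header
        df.to_csv('test.csv', index=False, mode='a', header=False)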
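
If installing toolz is not an option, the chunking step can be approximated with the standard library. The sketch below is an assumption-labeled substitute, not toolz's own implementation: the helper name `chunks` is hypothetical, and it mirrors the `partition_all(n, seq)` argument order by yielding tuples of at most `n` items, with a shorter final tuple.

```python
from itertools import islice


def chunks(n, iterable):
    """Yield successive tuples of at most n items from iterable."""
    iterator = iter(iterable)
    while True:
        chunk = tuple(islice(iterator, n))
        if not chunk:
            # The underlying iterator is exhausted.
            return
        yield chunk


# Demonstration on a generator of 250 integers, split into chunks of 100.
sizes = [len(chunk) for chunk in chunks(100, (i for i in range(250)))]
```

Here `sizes` ends up as `[100, 100, 50]`. Only one chunk is held in memory at a time, which is the property the CSV loop relies on.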
