
Saving large streaming data in Python

I have a large amount of data coming in every second in the form of Python dictionaries. Right now I am saving it to a MySQL server as it comes in, but that creates a backlog of more than a few hours. What is the best way to save the data locally and move it to the MySQL server every hour or so as one chunk, to save time? I have tried Redis, but it can't save a list of these dictionaries which I can later move to MySQL.

A little-known fact about the Python native pickle format is that you can happily concatenate multiple pickles into a single file.

That is, simply open a file in append mode and pickle.dump() your dictionary into that file. If you want to be extra fancy, you could do something like timestamped files:

import pickle
from datetime import datetime

def ingest_data(data_dict):
    # One file per hour, e.g. 2024-01-15_13.pickles
    filename = '%s.pickles' % datetime.now().strftime('%Y-%m-%d_%H')
    with open(filename, 'ab') as outf:
        pickle.dump(data_dict, outf, pickle.HIGHEST_PROTOCOL)


def read_data(filename):
    # Yield every dictionary that was appended to the file, in order.
    with open(filename, 'rb') as inf:
        while True:
            try:
                yield pickle.load(inf)
            except EOFError:
                return  # no more pickles left in this file
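
To cover the hourly bulk move into MySQL, here is a minimal sketch assuming the mysql-connector-python package; the `events` table, its `payload` JSON column, and the connection credentials are hypothetical placeholders you would replace with your own schema and settings:

import json
import mysql.connector  # assumption: mysql-connector-python is installed

def flush_to_mysql(filename):
    # Read every pickled dict from one hourly file and insert them in one batch.
    rows = [(json.dumps(d),) for d in read_data(filename)]
    if not rows:
        return
    conn = mysql.connector.connect(
        host='localhost', user='user', password='secret', database='mydb')  # placeholder credentials
    try:
        cur = conn.cursor()
        # 'events' and 'payload' are hypothetical; adjust to your table definition.
        cur.executemany("INSERT INTO events (payload) VALUES (%s)", rows)
        conn.commit()
    finally:
        conn.close()

Run `flush_to_mysql('2024-01-15_13.pickles')` (or whichever hourly file has just closed) from a cron job or a scheduled thread; a single `executemany` plus one commit is far cheaper than committing each dictionary as it arrives.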
