
Pandas dataframe to_csv - split into multiple output files

What is the best/easiest way to split a very large data frame (50GB) into multiple output files (horizontally)?

I thought about doing something like:

stepsize = int(1e8)
for chunk_id, start in enumerate(range(0, len(df), stepsize)):
    end = start + stepsize  # .iloc slicing is end-exclusive, so the last chunk simply comes out shorter
    df.iloc[start:end].to_csv('/data/bs_' + str(chunk_id) + '.csv.out')

But I bet there is a smarter solution out there?

As noted by jakevdp, HDF5 is a better way to store huge amounts of numerical data; however, it doesn't meet my business requirements.
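For reference, a minimal sketch of what the HDF5 route could look like, assuming pandas' to_hdf/read_hdf with the PyTables backend; the file path, key name, and row range below are illustrative choices, not from the thread:

import numpy as np
import pandas as pd

# Toy frame standing in for the real 50GB DataFrame.
df = pd.DataFrame(np.random.randn(1_000_000, 4), columns=list('abcd'))

# Write the frame once to an HDF5 store (needs the PyTables package, `pip install tables`).
# format='table' keeps the store row-sliceable; the key name 'df' is arbitrary.
df.to_hdf('/data/bs.h5', key='df', format='table')

# Read back only a row range instead of the whole file.
subset = pd.read_hdf('/data/bs.h5', key='df', start=0, stop=100_000)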

Use id in the filename, otherwise it will not work: the answer originally omitted id, and without it the code gives an error.

import numpy as np

for id, df_i in enumerate(np.array_split(df, number_of_chunks)):
    df_i.to_csv('/data/bs_{id}.csv'.format(id=id))

This answer led me to a satisfying solution:

for idx, chunk in enumerate(np.array_split(df, number_of_chunks)):
    chunk.to_csv(f'/data/bs_{idx}.csv')
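If number_of_chunks should follow from a target number of rows per file rather than a fixed number of files, it can be derived from the frame length. A minimal, self-contained sketch; the rows_per_chunk value, the toy frame, and index=False are my own illustrative choices:

import math
import numpy as np
import pandas as pd

# Toy frame standing in for the real 50GB DataFrame.
df = pd.DataFrame({'value': range(10_000)})

rows_per_chunk = 1_000  # illustrative target size for each output file
number_of_chunks = math.ceil(len(df) / rows_per_chunk)

for idx, chunk in enumerate(np.array_split(df, number_of_chunks)):
    chunk.to_csv(f'/data/bs_{idx}.csv', index=False)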
