I am using this piece of code to read a CSV (around 1 GB) with pandas and write it into multiple Excel sheets using chunksize.
import pandas as pd

with pd.ExcelWriter('/tmp/output.xlsx', engine='xlsxwriter') as writer:
    reader = pd.read_csv(f'/tmp/{file_name}', sep=',', chunksize=1000000)
    for idx, chunk in enumerate(reader):
        chunk.to_excel(writer, sheet_name=f"Report (P_{idx + 1})", index=False)
    # Note: writer.save() is unnecessary inside the context manager (the file
    # is saved on exit), and the method was removed in pandas 2.0.
This approach is taking a lot of time. Can anyone suggest approaches to reduce it?
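One thing worth checking first: the xlsx generation itself, not the CSV read, is usually the bottleneck at this size. If the downstream consumers can accept CSV part files instead of Excel sheets, splitting the input with the standard library alone is far faster. A minimal sketch, assuming the file has a single header row (the function name and the path template are my own placeholders, not from the question):

```python
import csv
import itertools

def split_csv(src_path, rows_per_part, dest_template):
    """Split a large CSV into numbered part files, repeating the header.

    dest_template is a format string such as '/tmp/report_part{}.csv'.
    Only one chunk of rows is held in memory at a time.
    """
    with open(src_path, newline='') as src:
        reader = csv.reader(src)
        header = next(reader)
        for idx in itertools.count(1):
            # Take the next rows_per_part rows; stop when the file is drained.
            rows = list(itertools.islice(reader, rows_per_part))
            if not rows:
                break
            with open(dest_template.format(idx), 'w', newline='') as dest:
                writer = csv.writer(dest)
                writer.writerow(header)
                writer.writerows(rows)
```

This keeps the same chunked structure as the original loop but avoids the expensive xlsx serialization entirely.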
A few days ago I faced the same problem, and here is what I tried:

You can use a library called vaex: https://vaex.readthedocs.io/en/latest/

Or, if you want to keep a pandas-like workflow at scale, try Apache PySpark.

Or you can use Google Cloud with its 1200 credit.
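If you prefer to stay with pandas and xlsxwriter, another option is xlsxwriter's constant_memory mode, which streams each row to disk instead of buffering whole sheets in memory and often speeds up large writes. A hedged sketch (the helper name, paths, and chunk size are assumptions; engine_kwargs requires pandas 1.3 or newer):

```python
import pandas as pd

def csv_to_excel_sheets(src_csv, dest_xlsx, rows_per_sheet):
    """Write each chunk of a CSV to its own sheet in one workbook."""
    with pd.ExcelWriter(
        dest_xlsx,
        engine='xlsxwriter',
        # constant_memory flushes rows as they are written; to_excel emits
        # rows in order, so this in-order-only mode is safe here.
        engine_kwargs={'options': {'constant_memory': True}},
    ) as writer:
        reader = pd.read_csv(src_csv, chunksize=rows_per_sheet)
        for idx, chunk in enumerate(reader):
            chunk.to_excel(writer, sheet_name=f'Report (P_{idx + 1})',
                           index=False)
```

The loop structure is the same as in the question; only the writer options change, so it is a low-risk thing to try before switching libraries.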