How to split huge CSV file?

I have a CSV file with 40k rows that loads as a single column, because the fields are separated by semicolons rather than commas.

insert_date;currency_from;currency_to;currency_value
0   2017-01-02 00:00:00.000;EUR;TL;3.7073
1   2017-01-02 00:00:00.000;USD;TL;3.5445
2   2017-01-02 00:00:00.000;GBP;TL;4.3510
3   2017-01-02 00:00:00.000;BTC;USD;0.0000
4   2017-01-02 00:00:00.000;EUR;USD;1.0459

This is what my data looks like as a pandas dataframe. I want to split on the semicolon to make separate columns.

In pandas, this is done with the sep parameter of read_csv, per the docs:

import pandas as pd

df = pd.read_csv('/path/to/file.csv', sep=';')
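To check the result without touching the real file, here is a minimal sketch that feeds a few of the rows from the question through io.StringIO (a stand-in for the file on disk). The parse_dates argument is an optional extra, not part of the answer above; it assumes you want insert_date as a datetime column rather than a string.

```python
import io

import pandas as pd

# A few rows copied from the question; io.StringIO stands in for the real file.
sample = io.StringIO(
    "insert_date;currency_from;currency_to;currency_value\n"
    "2017-01-02 00:00:00.000;EUR;TL;3.7073\n"
    "2017-01-02 00:00:00.000;USD;TL;3.5445\n"
    "2017-01-02 00:00:00.000;GBP;TL;4.3510\n"
)

# sep=';' tells pandas to split fields on semicolons instead of commas.
# parse_dates is optional and converts insert_date to a datetime column.
df = pd.read_csv(sample, sep=";", parse_dates=["insert_date"])

print(df.columns.tolist())
print(df.dtypes)
```

With the separator set, the frame should have four columns instead of one, and currency_value should be read as a float.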

Do you mean you would like to have a list of dicts?

import csv

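# newline='' is what the csv module docs recommend when opening files for it
with open('your/path/to/your/file.csv', newline='') as f:
    # Each row becomes a dict keyed by the header names
    data = list(csv.DictReader(f, delimiter=';'))

process_your_data(data)  # placeholder: replace with your own processing function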
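As a rough illustration of what the list of dicts looks like, the sketch below runs csv.DictReader over two rows from the question, again using io.StringIO in place of the real file. Note that the csv module returns every value as a string, so the float conversion at the end is an added step you may or may not need.

```python
import csv
import io

# Two rows copied from the question; io.StringIO stands in for the real file.
sample = io.StringIO(
    "insert_date;currency_from;currency_to;currency_value\n"
    "2017-01-02 00:00:00.000;EUR;TL;3.7073\n"
    "2017-01-02 00:00:00.000;USD;TL;3.5445\n"
)

# Each row becomes a dict keyed by the header names.
rows = list(csv.DictReader(sample, delimiter=";"))

# csv yields strings only, so numeric fields need an explicit conversion.
for row in rows:
    row["currency_value"] = float(row["currency_value"])

print(rows[0])
```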

Try using str.split():

Syntax: Series.str.split(pat=None, n=-1, expand=False)

Parameters:

pat: String separator or delimiter to split on.
n: Maximum number of splits to make in a single string. The default, -1, means split on every occurrence.
expand: Boolean. If True, returns a DataFrame with each piece in its own column; otherwise returns a Series of lists of strings.
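If the data is already loaded as a one-column DataFrame, a sketch along these lines could apply str.split to it. The frame below is built by hand from two rows in the question to mimic that situation; the names raw, only_col and split_df are illustrative, and renaming the columns from the header string is an assumption about how you want them labeled.

```python
import pandas as pd

# Mimic the situation in the question: one column whose name and values
# both still contain the semicolon-separated fields.
raw = pd.DataFrame(
    {
        "insert_date;currency_from;currency_to;currency_value": [
            "2017-01-02 00:00:00.000;EUR;TL;3.7073",
            "2017-01-02 00:00:00.000;USD;TL;3.5445",
        ]
    }
)

only_col = raw.columns[0]

# expand=True returns a DataFrame with one column per piece.
split_df = raw[only_col].str.split(";", expand=True)

# Reuse the pieces of the original header as the new column names.
split_df.columns = only_col.split(";")

# The pieces are strings, so convert the numeric column explicitly.
split_df["currency_value"] = split_df["currency_value"].astype(float)

print(split_df)
```

Re-reading the file with sep=';' is usually simpler; this route is mainly useful when the single-column frame is all you have.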
