
Pandas: read_csv (read multiple tables in a single file)

I have a file (example shown below) that contains multiple CSV tables. This file is uploaded to a database. I would like to do some operations on this file, and I was thinking of using pandas to read each table into a separate DataFrame with the read_csv function. However, going through the documentation, I didn't see an option to specify a subset of lines to read/parse. Is this possible? If not, are there other alternatives?

Sample file:

TABLE_1
col1,col2
val1,val2
val3,val4

TABLE_2
col1,col2,col3,col4
val1,val2,val3,val4
...

...

I can do an initial pass through the file to determine the start/end lines of each table. However, one of the read_csv arguments is "filepath_or_buffer", and I am not certain what the 'buffer' part means. Is it a list of strings, one big string, or something else? What can I use as a buffer? Can someone point me to a small example that uses read_csv with a buffer? Thanks for any ideas.

UPDATE:

If you want to skip specific lines, for example [0,1,5,16,57,58,59], you can use skiprows:

df = pd.read_csv(filename, header=None, 
                 names=['col1','col2','col3'], skiprows=[0,1,5,16,57,58,59])

To skip the first two lines and read the following 100 lines, you can use the skiprows and nrows parameters, as @Richard Telford mentioned in the comment:

df = pd.read_csv(filename, header=None, names=['col1','col2','col3'],
                 skiprows=2, nrows=100)
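Applied to the sample file from the question, the same two parameters can pick out one table at a time once its start line and row count are known. The following is only a sketch: the sample text is held in an in-memory string rather than a real file, and the line offsets are counted by hand for this particular sample.

```python
import io
import pandas as pd

# The question's sample layout, kept in memory for illustration only.
sample = """\
TABLE_1
col1,col2
val1,val2
val3,val4

TABLE_2
col1,col2,col3,col4
val1,val2,val3,val4
"""

# TABLE_1: skip the title line, treat the next line as the header,
# then read exactly two data rows.
table_1 = pd.read_csv(io.StringIO(sample), skiprows=1, nrows=2)

# TABLE_2: skip the six physical lines before its header
# (the TABLE_1 block, the blank line, and the "TABLE_2" title).
table_2 = pd.read_csv(io.StringIO(sample), skiprows=6, nrows=1)

print(table_1)
print(table_2)
```

Note that an integer skiprows counts physical lines, so the blank separator line is included in the count.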

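Here is a small example of a "buffer", which can be any file-like object with a read() method, such as io.StringIO or an open file handle: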

import io
import pandas as pd

data = """\
        Name
0  JP2015121
1    US14822
2    US14358
3  JP2015539
4  JP2015156
"""
# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated since pandas 2.2)
df = pd.read_csv(io.StringIO(data), sep=r"\s+", index_col=0)
print(df)

The same without a header:

data = """\
0  JP2015121
1    US14822
2    US14358
3  JP2015539
4  JP2015156
"""
# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated since pandas 2.2)
df = pd.read_csv(io.StringIO(data), sep=r"\s+", index_col=0,
                 header=None, names=['Name'])
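The buffer approach also covers the original multi-table problem without counting lines by hand. The sketch below is one possible way to do it, not a pandas feature: read_tables is a hypothetical helper, and it assumes that tables are separated by at least one blank line and that the first line of each block is the table name, as in the question's sample.

```python
import io
import re
import pandas as pd

def read_tables(text):
    """Parse blank-line-separated CSV blocks into a dict of DataFrames.

    Assumption: the first line of every block is the table name and
    the rest of the block is ordinary CSV with a header row.
    """
    tables = {}
    # Split the text wherever one or more blank lines occur.
    for block in re.split(r"\n\s*\n", text.strip()):
        name, _, body = block.partition("\n")
        # Each block body is wrapped in StringIO so read_csv sees a buffer.
        tables[name.strip()] = pd.read_csv(io.StringIO(body))
    return tables

# The question's sample layout, kept in memory for illustration only.
sample = """\
TABLE_1
col1,col2
val1,val2
val3,val4

TABLE_2
col1,col2,col3,col4
val1,val2,val3,val4
"""

tables = read_tables(sample)
for name, frame in tables.items():
    print(name, frame.shape)
```

For a real file, the text could come from open(filename).read(). A block that contains only a table name and no CSV body would make read_csv raise an error, so such input would need extra handling.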
