
Can't read quotes correctly with pandas read_csv

I have a file test.tsv in which some rows contain double quotes, and read_csv stops treating the newline character as a row separator once it hits one. For example, given the file

" m     1
what does comoda mean   1
the poke co     1
dmf     1
"g      1

and I use

test = pd.read_csv("test.tsv", 
                  sep='\t')

I get all the rows as one row

 m\t1\nwhat does comoda mean\t1\nthe poke co\t1\ndmf\t1\ng  1

I want to keep all rows intact and get the output

" m     1
what does comoda mean   1
the poke co     1
dmf     1
"g      1

Is there a way to solve this double-quote issue? Wherever a double quote is opened, multiple rows come out as a single row until a closing double quote appears. After that, the rows are interpreted correctly.

You can control how quotes are parsed with the quoting keyword parameter of pandas.read_csv. In your case, you can disable quoting like so:

>>> import pandas as pd
>>> import csv

>>> pd.read_csv("test.tsv", sep='\t', quoting=csv.QUOTE_NONE)                 

                     " m  1
0  what does comoda mean  1
1            the poke co  1
2                    dmf  1
3                     "g  1

Note that the first row is being interpreted as a column header. Pass header=None to prevent that.
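
Putting both points together, here is a minimal sketch that reproduces the problem and the fix without touching the disk. It assumes the file content shown in the question and feeds it through io.StringIO in place of test.tsv; the column names "text" and "count" are made up for illustration and are not part of the original data.

```python
import csv
import io

import pandas as pd

# In-memory stand-in for test.tsv (same rows as in the question).
raw = '" m\t1\nwhat does comoda mean\t1\nthe poke co\t1\ndmf\t1\n"g\t1\n'

# Default quoting: everything between the two double quotes is read
# as one field, so the embedded newlines no longer end a row.
df_default = pd.read_csv(io.StringIO(raw), sep="\t", header=None)
print(df_default.shape)

# Quoting disabled: a double quote is an ordinary character, and
# header=None keeps the first line as data instead of column names.
df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    quoting=csv.QUOTE_NONE,
    header=None,
    names=["text", "count"],  # hypothetical column names
)
print(df)
```

With quoting disabled, the double quotes stay in the values as literal characters, so strip them afterwards if they are not wanted.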
