[英]Escaped quotes in pandas read_csv
I am unable to create a dataframe which has escaped quotes when using read_csv
. 使用read_csv
时,我无法创建已转义引号的数据read_csv
。
(Note: R's read.csv
works as expected.) (注意:R的read.csv
按预期工作。)
import pandas as pd
pd.read_csv('data.csv')
#error!
CParserError: Error tokenizing data. C error: Expected 2 fields in line 4, saw 3
SEARCH_TERM,ACTUAL_URL
"bra tv bord","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"
"tv på hjul","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"
"SLAGBORD, \"Bergslagen\", IKEA:s 1700-tals serie","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"
How can I read this csv and avoid this error? 如何阅读此csv并避免此错误?
My guess is that pandas is using some regular expressions which cannot handle the ambiguity and trips on the third row, or more specifically: \\"Bergslagen\\"
. 我的猜测是,大熊猫正在使用一些正则表达式,这些表达式无法处理第三行的歧义和行程,或者更具体地说: \\"Bergslagen\\"
。
It does work, but you have to indicate the escape character for the embedded quotes: 它确实有效,但你必须指出嵌入式引号的转义字符:
In [1]: data = '''SEARCH_TERM,ACTUAL_URL
"bra tv bord","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"
"tv p\xc3\xa5 hjul","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"
"SLAGBORD, \\"Bergslagen\\", IKEA:s 1700-tals serie","http://www.ikea.com/se/sv/catalog/categories/departments/living_room/10475/?se%7cps%7cnonbranded%7cvardagsrum%7cgoogle%7ctv_bord"'''
In [2]: df = read_csv(StringIO(data), escapechar='\\', encoding='utf-8')
In [3]: df
Out[3]:
SEARCH_TERM ACTUAL_URL
0 bra tv bord http://www.ikea.com/se/sv/catalog/categories/d...
1 tv på hjul http://www.ikea.com/se/sv/catalog/categories/d...
2 SLAGBORD, "Bergslagen", IKEA:s 1700-tals serie http://www.ikea.com/se/sv/catalog/categories/d...
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.