Read a CSV file in Python that uses double quotes to group values within a column and commas to separate columns
I have a file that uses commas both to separate columns and to indicate empty column values. In the same file, double quotes enclose fields that hold several values, and commas separate the values inside such a field. If a column holds only a single value, double quotes are not used.
Example:
col1,col2,col3,col4,col5,col6
name, age,,,"cat,dog", year
name,age,weight,height,cat,year
"first name,last name",age, weight,,"dog, cat, another dog",,
Expected result:
| col1 | col2 | col3 | col4 | col5 | col6 |
|---|---|---|---|---|---|
| name | age | | | dog, cat | year |
| name | age | weight | height | cat | year |
| first name, last name | age | weight | | dog, cat, another dog | |
Another important detail, in case it matters: the CSV uses Windows-1252 encoding.
Your CSV is not well formed: the last row has an extra comma at the end, which gives it seven fields instead of six. If you remove that extra comma, this code will work:
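import pandas as pd

# quotechar='"' and delimiter=',' are the pandas defaults; they are spelled out here for clarity.
# encoding='cp1252' matches the Windows-1252 encoding mentioned in the question.
df = pd.read_csv('data/test_data.txt', quotechar='"', delimiter=',', encoding='cp1252')
print(df)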
The output is:
                   col1  col2    col3    col4                   col5  col6
0                  name   age     NaN     NaN                cat,dog  year
1                  name   age  weight  height                    cat  year
2  first name,last name   age  weight     NaN  dog, cat, another dog   NaN
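If editing the file by hand is not an option, the stray trailing comma can also be tolerated in code. The following is only a minimal sketch using the standard-library `csv` module rather than pandas. The helper name `read_rows` and the inline `SAMPLE` string are made up for illustration, and the sketch assumes that any surplus fields at the end of a row are empty and safe to drop.

```python
import csv
import io

# Sample data copied from the question; the last row ends with an extra comma.
SAMPLE = (
    'col1,col2,col3,col4,col5,col6\n'
    'name, age,,,"cat,dog", year\n'
    'name,age,weight,height,cat,year\n'
    '"first name,last name",age, weight,,"dog, cat, another dog",,\n'
)


def read_rows(stream):
    """Parse CSV text and trim or pad each row to the header's width.

    Hypothetical helper: rows with surplus non-empty fields are left as they are.
    """
    # skipinitialspace=True drops the blank that follows some delimiters,
    # so ' age' is read as 'age'. Spaces inside quoted fields are kept.
    reader = csv.reader(stream, delimiter=',', quotechar='"', skipinitialspace=True)
    header = next(reader)
    width = len(header)
    rows = []
    for fields in reader:
        # Drop surplus trailing fields only while they are empty.
        while len(fields) > width and fields[-1] == '':
            fields.pop()
        # Pad short rows with empty strings (no-op when the row is already wide enough).
        fields += [''] * (width - len(fields))
        rows.append(fields)
    return header, rows


header, rows = read_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

To read the real file instead of the inline sample, the stream could be opened with `open('data/test_data.txt', encoding='cp1252', newline='')`, where `cp1252` is Python's codec name for Windows-1252. Empty columns come back as empty strings here rather than `NaN`.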