pandas.read_csv is ignoring quoting of strings
I am having trouble importing a csv file into a pandas dataframe. The import does not skip the commas that are enclosed in quotes, so the quoted fields are split into extra columns.
I have tried different options for quotechar, but none made any difference.
import csv
import pandas
df = pandas.read_csv( 'test_quote.csv', header=None,sep=',', quotechar='\"', quoting=csv.QUOTE_MINIMAL, encoding='ascii', engine='python')
print(df)
code output
$ python3 test_quote.py
0 1 2 3 4 5 6
0 201571 2080 "December 2 2022" "November 1 - November 30 2022" 487.29
1 345741 5377 "December 3 2022" "November 1 - November 30 2022" 729.35
2 995349 3672 "December 2 2022" "November 1 - November 30 2022" 937.33
3 475601 3672 "December 2 2022" "November 1 - November 30 2022" 790.17
4 228548 3672 "December 7 2022" "November 1 - November 30 2022" 682.38
expected output
$ python3 test_quote.py
0 1 2 3 4
0 201571 2080 "December 2, 2022" "November 1 - November 30, 2022" 487.29
1 345741 5377 "December 3, 2022" "November 1 - November 30, 2022" 729.35
2 995349 3672 "December 2 , 2022" "November 1 - November 30 , 2022" 937.33
3 475601 3672 "December 2 , 2022" "November 1 - November 30 , 2022" 790.17
4 228548 3672 "December 7, 2022" "November 1 - November 30, 2022" 682.38
input file = test_quote.csv
201571, 2080, "December 2, 2022", "November 1 - November 30, 2022", 487.29
345741, 5377, "December 3, 2022", "November 1 - November 30, 2022", 729.35
995349, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 937.33
475601, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 790.17
228548, 3672, "December 7, 2022", "November 1 - November 30, 2022", 682.38
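The extra spaces after the commas are causing the issue. A quote character is only recognized when it is the first character of a field; here every field after the first starts with a space, so the quotes are read as ordinary data and the commas inside them still split the field. Use the following, but note that most of your parameters are already the defaults.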
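import pandas

# skipinitialspace=True drops the space after each delimiter,
# so the opening quote becomes the first character of the field
df = pandas.read_csv('test_quote.csv', header=None, skipinitialspace=True)
print(df)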
Output:
0 1 2 3 4
0 201571 2080 December 2, 2022 November 1 - November 30, 2022 487.29
1 345741 5377 December 3, 2022 November 1 - November 30, 2022 729.35
2 995349 3672 December 2 , 2022 November 1 - November 30 , 2022 937.33
3 475601 3672 December 2 , 2022 November 1 - November 30 , 2022 790.17
4 228548 3672 December 7, 2022 November 1 - November 30, 2022 682.38
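To check the effect without creating a file, the comparison can be reproduced in memory. This is only a minimal sketch: the data is the first two rows and the fourth row of test_quote.csv pasted into a string, and the names `sample`, `naive` and `fixed` are made up for the illustration.

```python
import io

import pandas

# Three rows copied from test_quote.csv (hypothetical in-memory stand-in for the file)
sample = (
    '201571, 2080, "December 2, 2022", "November 1 - November 30, 2022", 487.29\n'
    '345741, 5377, "December 3, 2022", "November 1 - November 30, 2022", 729.35\n'
    '475601, 3672, "December 2 , 2022", "November 1 - November 30 , 2022", 790.17\n'
)

# Default behaviour: the space before each opening quote makes the quote
# part of the data, so the commas inside the quotes still split the fields
naive = pandas.read_csv(io.StringIO(sample), header=None)

# With skipinitialspace=True the space after each delimiter is dropped,
# the quotes are recognized, and the embedded commas are kept
fixed = pandas.read_csv(io.StringIO(sample), header=None, skipinitialspace=True)

print(naive.shape)       # number of columns when the quotes are ignored
print(fixed.shape)       # number of columns when the quotes are honoured
print(fixed.iloc[0, 2])  # first date field, with its comma intact
```

The two shapes should differ by two columns, one for each quoted comma in a row. Note that the surrounding quotes are removed from the parsed values, as in the output above.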