I am trying to use pandas to convert my xlsx file to CSV. Some of the data contains double quotes, so I am passing an escape character. However, it doesn't seem to work.
test.xlsx
test.csv (output)
"name"|"address"
"abc"|"""canada"""
"xyz"|"""US"""
expected output
"name"|"address"
"abc"|\""canada"\"
"xyz"|\""US"\"
convert.py
import csv
import pandas as pd
df = pd.read_excel("test.xlsx")
df.to_csv("test.csv", sep="|", index=False, quoting=csv.QUOTE_ALL, encoding="utf-8", escapechar='\\')
What is the purpose of the escape character? Shouldn't it escape the double quotes, since they are part of the data?
Are you sure your expected output is correct? I would expect it to look like the following:
"name","address"
"abc","\"canada\""
"xyz","\"US\""
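To do that, pass doublequote=False. With the default doublequote=True, the writer escapes a quote inside a field by doubling it and does not use escapechar for that purpose; with doublequote=False, it prefixes each embedded quote with escapechar instead.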
import csv
import pandas as pd
df = pd.read_excel("test.xlsx")
df.to_csv("test.csv", index=False, quoting=csv.QUOTE_ALL, doublequote=False, encoding="utf-8", escapechar='\\')
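To compare the two behaviors without an Excel file, here is a minimal sketch that builds the rows in memory and writes them both ways. The DataFrame contents are an assumption inferred from the CSV output in the question (address values that carry literal double quotes), and the variable names are only illustrative.

```python
import csv
import io

import pandas as pd

# Assumed contents of test.xlsx, inferred from the question's CSV output:
# the address values include literal double quotes.
df = pd.DataFrame({"name": ["abc", "xyz"], "address": ['"canada"', '"US"']})

# Default (doublequote=True): embedded quotes are doubled,
# so escapechar is not applied to them.
default_text = df.to_csv(index=False, quoting=csv.QUOTE_ALL, escapechar="\\")

# doublequote=False: embedded quotes are prefixed with escapechar instead.
escaped_text = df.to_csv(
    index=False, quoting=csv.QUOTE_ALL, doublequote=False, escapechar="\\"
)

print(default_text)
print(escaped_text)

# Reading the escaped form back needs the same dialect settings
# to recover the original values.
rows = list(
    csv.reader(io.StringIO(escaped_text), doublequote=False, escapechar="\\")
)
print(rows)
```

Adding sep="|" to either to_csv call changes only the delimiter; the quoting and escaping behavior stays the same.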
If you're sure what you posted as your desired output is really what you need, I'm not sure there's a way to do so.