Pandas Converting xlsx to CSV - escapechar not working

I am trying to use Pandas to convert my xlsx file to CSV. Some of the data contains double quotes, so I am passing an escape character. However, it doesn't seem to work.

test.xlsx

(screenshot of the spreadsheet; image not available)

test.csv (output)

"name"|"address"
"abc"|"""canada"""
"xyz"|"""US"""

expected output

"name"|"address"
"abc"|\""canada"\"
"xyz"|\""US"\"

convert.py

import csv
import pandas as pd
df = pd.read_excel("test.xlsx")
df.to_csv("test.csv", sep="|", index=False, quoting=csv.QUOTE_ALL, encoding="utf-8", escapechar='\\')

What's the purpose of the escape character? Shouldn't it escape the double quotes, since they are part of the data?

Are you sure your expected output is correct? I would expect it to look like the following:

"name","address"
"abc","\"canada\""
"xyz","\"US\""

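To get that, pass doublequote=False. With the default doublequote=True, a quote inside a field is escaped by doubling it, so escapechar is never used for quotes.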

import csv
import pandas as pd
df = pd.read_excel("test.xlsx")
df.to_csv("test.csv", index=False, quoting=csv.QUOTE_ALL, doublequote=False, encoding="utf-8", escapechar='\\')
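
To see the difference without the Excel file, here is a minimal sketch that builds the frame in memory. The DataFrame contents are an assumption inferred from the CSV output in the question, and the names common, default_lines and escaped_lines are only for illustration. It keeps the question's sep="|" so the two results can be compared directly.

```python
import csv

import pandas as pd

# Stand-in for pd.read_excel("test.xlsx"); the values are assumed from
# the CSV output shown in the question.
df = pd.DataFrame({"name": ["abc", "xyz"], "address": ['"canada"', '"US"']})

# Options shared by both calls (same as the question's call, minus the file).
common = dict(sep="|", index=False, quoting=csv.QUOTE_ALL, escapechar="\\")

# Default doublequote=True: embedded quotes are doubled, so escapechar
# is not applied to them.
default_lines = df.to_csv(**common).splitlines()

# doublequote=False: embedded quotes are prefixed with escapechar instead.
escaped_lines = df.to_csv(doublequote=False, **common).splitlines()

print(default_lines[1])  # "abc"|"""canada"""
print(escaped_lines[1])  # "abc"|"\"canada\""
```

The first line printed matches what the question reports, and the second is the backslash-escaped form, which suggests the original call was behaving as documented and not ignoring escapechar.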

If you're sure the desired output you posted is really what you need, I'm not sure there's a way to get it.
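
Part of why I'd expect the backslash-escaped form is that it reads back cleanly. The sketch below uses only Python's standard csv module (which pandas relies on for writing) to write the rows with doublequote=False and parse them again with the same settings. The rows list and the fmt dict are illustrative assumptions, not taken from the original files.

```python
import csv
import io

# Illustrative rows, assumed to mirror the spreadsheet in the question.
rows = [["name", "address"], ["abc", '"canada"'], ["xyz", '"US"']]

# Dialect options shared by the writer and the reader.
fmt = dict(delimiter="|", quotechar='"', doublequote=False, escapechar="\\")

# Write every field quoted, escaping embedded quotes with a backslash.
buf = io.StringIO(newline="")
csv.writer(buf, quoting=csv.QUOTE_ALL, **fmt).writerows(rows)
text = buf.getvalue()

# Parse the text back with the same options.
parsed = list(csv.reader(io.StringIO(text, newline=""), **fmt))

print(text.splitlines()[1])  # "abc"|"\"canada\""
print(parsed == rows)        # True
```

Whatever consumes the file has to be told about the escape character as well; a reader left at its defaults expects doubled quotes.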

