I'm trying to get csv.writer to escape double quotes by doubling them: if a double quote appears in a data field, it should be written as two double quotes.
My function is part of an Apache Beam Dataflow job.
Any advice would be appreciated.
The input record: "ab"c","def"
The actual output my function returns: abc", def
The output I'm trying to achieve: "abc""", def
The input file may contain records like this:
1, "mystring1","mystring2"
2, "mystring3","mystring4"
3, "myst"ring5","mystring6"
Notice record 3 has a double quote in the field.
I would like to escape that double quote by adding
a double quote before it, then quote the entire field. The desired output:
1, mystring1,mystring2
2, mystring3,mystring4
3, "myst""ring5",mystring6
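To illustrate the quoting rule described above: Python's csv.writer applies it by default (doublequote=True with QUOTE_MINIMAL) when it is handed field values that have already been parsed. This is only a minimal sketch; the helper name format_row is hypothetical, and it assumes the inner quote is already in the right place in the field value.

```python
import csv
import io

def format_row(fields):
    """Hypothetical helper: serialize already-parsed field values as one CSV line."""
    buffer = io.StringIO()
    # doublequote=True (the default) doubles any quotechar inside a field,
    # and QUOTE_MINIMAL quotes only the fields that need it.
    writer = csv.writer(buffer, quotechar='"', delimiter=',',
                        quoting=csv.QUOTE_MINIMAL, doublequote=True)
    writer.writerow(fields)
    # Read the buffer before it goes away, and drop the line terminator.
    return buffer.getvalue().rstrip('\r\n')

print(format_row(['3', 'myst"ring5', 'mystring6']))
# prints: 3,"myst""ring5",mystring6
```

Note that the writer's result has to be read from the buffer with getvalue(); joining the parsed fields by hand produces no quoting at all.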
The function I'm calling:
import csv
import io

def parse_file(element):
    for line in csv.reader([element], quotechar='"', delimiter=','):
        output_str = io.StringIO()
        cw = csv.writer(output_str, quotechar='"', delimiter=',', escapechar='"', quoting=csv.QUOTE_MINIMAL)
        cw.writerow(line)
        output_str.close()
        clean_line = ', '.join(line)
        return clean_line
Here is a simple solution which takes the input element of type string.
vec = str('"ab"c","def""')
print(list(map(lambda x: '"' + x + '"' if '""' in x else x, [y.strip('"').replace('"', '""') for y in vec.split(',')])))
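The same logic may be easier to follow when unrolled into a small function. This is a sketch rather than a definitive implementation: the name quote_fields is hypothetical, it additionally strips whitespace around each field, and, like the one-liner, it assumes that no field contains a comma.

```python
def quote_fields(record):
    """Hypothetical helper: double inner quotes and re-quote only affected fields."""
    fields = []
    for raw in record.split(','):
        # Drop surrounding whitespace and the outer quotes,
        # then double any quote left inside the field.
        value = raw.strip().strip('"').replace('"', '""')
        # Wrap the field in quotes only if it contained a quote.
        fields.append('"' + value + '"' if '""' in value else value)
    return ','.join(fields)

print(quote_fields('"ab"c","def""'))
# prints: "ab""c",def
print(quote_fields('3, "myst"ring5","mystring6"'))
# prints: 3,"myst""ring5",mystring6
```

Splitting on commas by hand keeps the inner quote in its original position, which csv.reader does not do for malformed input like this.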
If I misunderstood something, I apologize.