I have data to read from a .csv file, which looks like this:
col1,col2,col3,col4,col5
"val1,val2,""{'key1': 'value1', 'key2': 'value2'}"",val4,val5"
"va11,val12,""{'key11': 'value11', 'key12': 'value12'}"",val14,val15"
I've tried importing this file with pandas in many ways, but I always get an error. Is there an easy way to do this with pandas?
These rows look like valid CSV rows that were then put through a CSV writer a second time. That second pass turned each row into a single column, wrapping it in quotes and doubling the inner quotes to escape the commas and quotes of the already-encoded row. You can reverse that process to load the CSV, or fix the writer, which is the real source of the bug.
import csv
import io
import pandas as pd

unmangled = io.StringIO()
with open("test.csv", newline="") as infile:
    # the header is not mangled, so just copy it through
    unmangled.write(next(infile))
    # read the CSV - the first column is a CSV-encoded CSV row
    unmangled.writelines(row[0] + "\n" for row in csv.reader(infile))
# rewind and read the unmangled CSV
unmangled.seek(0)
df = pd.read_csv(unmangled)
print(df)
Output:
   col1   col2                                      col3   col4   col5
0  val1   val2      {'key1': 'value1', 'key2': 'value2'}   val4   val5
1  va11  val12  {'key11': 'value11', 'key12': 'value12'}  val14  val15
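To see why the rows are described above as double-encoded, here is a minimal sketch that reproduces the mangling in memory and then reverses it, using only the standard csv module. The helper name encode_row is made up for this illustration, and the sketch assumes the original writer used Python's default CSV dialect, which the question does not confirm.

```python
import csv
import io


def encode_row(fields):
    """Encode a list of fields as one CSV line (hypothetical helper)."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    # Drop the line terminator added by the writer.
    return buf.getvalue().rstrip("\r\n")


fields = ["val1", "val2", "{'key1': 'value1', 'key2': 'value2'}", "val4", "val5"]

# First pass: a normal CSV row. Only the field containing commas is quoted.
once = encode_row(fields)

# Second pass (the suspected bug): the whole line is written as ONE field,
# so it is wrapped in quotes and every inner double quote is doubled.
twice = encode_row([once])

# Reversing it: parse the mangled line, take its single field, parse again.
recovered_line = next(csv.reader([twice]))[0]
recovered_fields = next(csv.reader([recovered_line]))

print(once)
print(twice)
print(recovered_fields == fields)
```

Here twice has the same shape as the data rows in the question, and recovered_fields matches the original list, which is the same round trip the answer performs on the file through row[0].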