Importing csv file with line breaks to R or Python Pandas
I have a csv file that includes line breaks within columns:
"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
ID 3, for example, contains such a line break.
How can this be imported into Python or R? Also, I don't mind if those line breaks are replaced by a space, for example.
You can also use the read_csv function from the Python pandas library. Make sure to specify the escape character.
import pandas as pd
df = pd.read_csv('path_to_csv', sep=',', escapechar='\\')
Please note that the second backslash escapes the first one inside the Python string literal, so the value actually passed to escapechar is a single backslash. This is Python string syntax and has nothing to do with pandas or csv.
Python has a built-in CSV reader which handles that for you. See the csv documentation.
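To make this concrete, here is a minimal sketch that feeds the sample data from the question to read_csv through an in-memory buffer instead of a file. The variable names `raw` and `df` are only illustrative, and the final replace step is an optional extra (the question mentions that turning line breaks into spaces would be acceptable).

```python
import io

import pandas as pd

# Sample data from the question; the raw string keeps the backslashes literal.
raw = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''

# escapechar="\\" is a single backslash: it tells the parser that \" inside
# a quoted field is a literal double quote, not the end of the field.
df = pd.read_csv(io.StringIO(raw), escapechar="\\")

# Optional: replace line breaks embedded in the comment column with a space.
df["comment"] = df["comment"].str.replace(r"\r?\n", " ", regex=True)

print(df)
```

With a real file, the `io.StringIO(raw)` argument would simply be the path to the csv file.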
import csv

# newline='' leaves line breaks inside quoted fields for the csv module to handle
with open(filename, newline='') as f:
    reader = csv.reader(f)
    csv_rows = list(reader)
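As a self-contained sketch of the same approach, the snippet below parses the sample data from the question out of an in-memory buffer. Passing escapechar is an addition that is not part of the answer above: without it, the csv module treats the backslash as an ordinary character, so the \" sequences in the sample may not come back as plain double quotes. The names `raw`, `rows`, and `cleaned` are illustrative.

```python
import csv
import io

# Sample data from the question; the raw string keeps the backslashes literal.
raw = r'''"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483
'''

# The reader joins physical lines that belong to one quoted field, so the
# record with id 3 comes back as a single row with an embedded newline.
reader = csv.reader(io.StringIO(raw), escapechar="\\")
rows = list(reader)

# Optional: replace embedded line breaks with a space in every cell.
cleaned = [[cell.replace("\n", " ") for cell in row] for row in rows]

for row in cleaned:
    print(row)
```

All values come back as strings with this reader, so numeric columns such as id and x would still need converting if numbers are required.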
The problem seemed to be not the line breaks, but rather the escaped double quotes within the columns: \"
Python: zvone's answer worked fine!
import csv
with open(filename) as f:
    reader = csv.reader(f)
    csv_rows = list(reader)
R: readr::read_csv worked without having to change any of the defaults.