Importing csv file with line breaks to R or Python Pandas

I have a csv file that includes line breaks within columns:

"id","comment","x"
1,"ABC\"xyz",123
2,"xyz\"abc",543
3,"abc
xyz",483

ID 3, for example, contains such a line break.

How can this be imported into Python or R? I also don't mind if those line breaks are replaced by a space, for example.

You can also use the read_csv function from the pandas library. Make sure to specify the escape character.

import pandas as pd
df = pd.read_csv('path_to_csv', sep=',', escapechar='\\')

Note that the second backslash escapes the first one inside the Python string literal, so escapechar is a single backslash. That is Python string syntax and has nothing to do with pandas or CSV.
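As a minimal sketch of this approach, the snippet below runs the sample data from the question through read_csv. Reading from io.StringIO instead of a file path is an assumption made only so the example is self-contained, and the final replacement step is one possible way to turn the embedded line break into a space, as the question allows.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch runs without a file.
raw = (
    '"id","comment","x"\n'
    '1,"ABC\\"xyz",123\n'
    '2,"xyz\\"abc",543\n'
    '3,"abc\nxyz",483\n'
)

# escapechar='\\' tells the parser that a backslash escapes the next character.
df = pd.read_csv(io.StringIO(raw), sep=',', escapechar='\\')

# Optional: replace line breaks inside the comment column with a space.
df['comment'] = df['comment'].str.replace('\n', ' ', regex=False)

print(df)
```

For a real file, pass the path to read_csv in place of the StringIO object.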

Python has a built-in CSV reader which handles that for you. See the csv documentation.

import csv

# newline='' lets the csv module handle line breaks inside quoted fields itself
with open(filename, newline='') as f:
    reader = csv.reader(f)
    csv_rows = list(reader)
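A hedged sketch of the same idea applied to the sample data from the question: the reader copes with the line break inside the quoted field, while the backslash-escaped quotes may not be decoded as intended unless escapechar is set. Using io.StringIO in place of an open file, and the names raw, header, rows and cleaned, are assumptions for illustration.

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without a file.
raw = (
    '"id","comment","x"\n'
    '1,"ABC\\"xyz",123\n'
    '2,"xyz\\"abc",543\n'
    '3,"abc\nxyz",483\n'
)

# escapechar='\\' makes the reader treat \" as a literal quote character.
reader = csv.reader(io.StringIO(raw), escapechar='\\')
header, *rows = reader

# Optional: replace line breaks inside any field with a space.
cleaned = [[cell.replace('\n', ' ') for cell in row] for row in rows]

for row in cleaned:
    print(row)
```

Note that csv.reader returns every field as a string, so numeric columns such as id and x would still need to be converted.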

The problem seemed to be not the line breaks, but rather the escaped double quotes within the columns: \" .

Python: zvone's answer worked fine!

import csv

with open(filename, newline='') as f:
    reader = csv.reader(f)
    csv_rows = list(reader)

R: readr::read_csv worked without having to change any of the defaults.
