
'rU' mode for StringIO (new-line character seen in unquoted field)

I am trying to parse gzip files line by line:

import csv
import gzip
from io import StringIO

with gzip.open(obj.get()['Body']) as f:
    for line in f:
        line = StringIO(line.decode("utf-8"))
        line = csv.reader(line, delimiter=',')
        for line1 in line:
            ...  # some logic

But for some of the files I get an error:

new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

When I try to open it in universal-newline mode:

csv.reader(open(line, 'rU'), delimiter=',')

I get:

expected str, bytes or os.PathLike object, not _io.StringIO

I want every field that contains '\r' to keep it as part of the string value. How can this be resolved?

Something like this, which avoids using csv.reader and StringIO:

import gzip

with gzip.open(obj.get()['Body']) as f:
    for line in f:
        line = line.strip()
        # Plain split: commas inside quoted fields are not handled
        line = line.decode("utf-8").split(',')

        for line1 in line:
            ...  # some logic
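
If the files may also contain quoted fields, a plain split(',') would break on commas inside quotes. One possible variation, offered only as a sketch, keeps csv.reader but hides each '\r' behind a placeholder while parsing and restores it afterwards. The helper name parse_row and the placeholder character are assumptions made for illustration, and the placeholder is assumed never to occur in the real data.

```python
import csv

# Assumption: this private-use character never appears in the real data.
PLACEHOLDER = "\ue000"


def parse_row(text, delimiter=","):
    """Parse one decoded line, keeping any embedded '\\r' inside its field."""
    # Drop only the line terminator, not a '\r' in the middle of the line.
    if text.endswith("\r\n"):
        text = text[:-2]
    elif text.endswith("\n"):
        text = text[:-1]
    # Hide the remaining '\r' characters so csv does not treat them as newlines.
    protected = text.replace("\r", PLACEHOLDER)
    row = next(csv.reader([protected], delimiter=delimiter))
    # Put the '\r' characters back into the parsed fields.
    return [field.replace(PLACEHOLDER, "\r") for field in row]


print(parse_row('a\rb,"c,d",e\n'))
```

Inside the loop above, this would be called as parse_row(line.decode("utf-8")) without stripping the line first. Quoted fields that span several physical lines are still not covered, because the file is read one line at a time.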

According to https://docs.python.org/3.7/library/io.html?highlight=io#io.StringIO, if you pass None as the second argument (newline), StringIO should recognize all newline characters.
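
To make the effect of that second argument concrete, here is a small sketch using made-up sample data. With newline=None, StringIO reads a bare '\r' as a line ending, so csv.reader no longer raises the error. The field is then split across two rows instead of keeping the '\r' inside it, which may not be what the question asks for.

```python
import csv
from io import StringIO

# Made-up sample line: an unquoted field containing a bare '\r'.
sample = "a\rb,c\n"

# newline=None turns on universal-newline handling: '\r' is read as '\n'.
buffer = StringIO(sample, newline=None)
rows = list(csv.reader(buffer, delimiter=","))

print(rows)
```

Here the single input line comes back as two rows, one holding 'a' and one holding 'b' and 'c'.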
