
CSV module in Python - issue with newline

I have a csv file with the following data:

"field1"|"field2"|"field3"
"12ed"|"ksdk"|"sjdhs"
"1323"|"jdjsk
sjfsk"|"sk"k"sd"

My expected output:

field1|field2|field3
12ed|ksdk|sjdhs
1323|jdjsk sjfsk|sk"k"sd

Both of my issues are in line 3. First, the data contains double quotes inside a double-quoted field, and these should be kept in the final output. Second, the value of a column contains a newline/line break.

Since I read the data with QUOTE_NONE, I can strip the outer quotes with [1:-1], but I can't replace the newline with an empty value.

import csv

with open(fileIn, "rb") as input:
    with open(fileOut, 'wb') as output:
        w = csv.writer(output, delimiter='|', quoting=csv.QUOTE_NONE, quotechar='')
        for record in csv.reader(input, delimiter='|', quoting=csv.QUOTE_NONE):
            # r = map(lambda x: x.replace("\n", ""), record)  --> This is not working
            print([s[1:-1] for s in record])
            w.writerow([s[1:-1] for s in record])

With this code I can strip the first and last quotes and keep the quotes inside the data, but I can't handle the newline.
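It may help to look at what the reader actually yields. With QUOTE_NONE the quotes carry no special meaning, so the line break inside the second field ends the row before any per-field replace can run. The following is a minimal sketch, assuming Python 3 and an in-memory copy of the sample data (the names `sample` and `rows` are illustrative, not from the original post):

```python
import csv
import io

# In-memory copy of the sample file; `sample` is an illustrative name.
sample = (
    '"field1"|"field2"|"field3"\n'
    '"12ed"|"ksdk"|"sjdhs"\n'
    '"1323"|"jdjsk\n'
    'sjfsk"|"sk"k"sd"\n'
)

# Same reader settings as in the question.
rows = list(csv.reader(io.StringIO(sample), delimiter="|", quoting=csv.QUOTE_NONE))

for row in rows:
    print(row)
```

The third record comes back as two separate rows, `['"1323"', '"jdjsk']` and `['sjfsk"', '"sk"k"sd"']`, so no field ever contains a "\n" to replace.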

Update:

The csv file contents:

"id"|"comments"|"Date"
"B-7"|"Hi How . 


Are You."|"2017-03-15 13:53:23.727"
"8-C"|"How was "your day" today"|"2017-02-06 11:45:26.783"

The error:

['"id"', '"comments"', '"Date"']
['"B-7"', '"Hi How . ']
[]
Traceback (most recent call last):
File "try.py", line 23, in <module>
appendRecords(record, oldRecord)
File "try.py", line 8, in appendRecords
oldRecord[-1] = oldRecord[-1] + ' ' + record[0]
IndexError: list index out of range

FYI - I'm using version 2.6.6.

One option is to add a check: if the last column of a row does not end with ", don't write the row to the output file. Instead, merge the next row into it, and write the merged row to the output file once its last column ends with ".

Merge works like list.extend, except that the last element of the first list and the first element of the second list are also concatenated.
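The merge described here can be sketched as a small pure function. This is only an illustration: the name `merge_rows` is hypothetical, and joining with a single space is an assumption taken from the expected output in the question.

```python
def merge_rows(first, second):
    """Return first + second, with first's last field joined to second's first field."""
    if not first:
        return list(second)
    if not second:
        # Nothing to merge, e.g. a blank line in the input.
        return list(first)
    joined = first[-1] + " " + second[0]
    return first[:-1] + [joined] + second[1:]


merged = merge_rows(['"1323"', '"jdjsk'], ['sjfsk"', '"sk"k"sd"'])
print(merged)
```

Unlike an in-place version, this returns a new list and leaves both arguments untouched.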

This code should work for you:

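import csv


def appendRecords(record, oldRecord):
    # Merge record into oldRecord: join oldRecord's last field with record's
    # first field, then append the remaining fields.
    # The length check guards against empty lines in the input csv file.
    if len(record):
        oldRecord[-1] = oldRecord[-1] + ' ' + record[0]
        record.pop(0)
        oldRecord.extend(record)


# fileIn and fileOut are the input and output paths, as in the question.
with open(fileIn, "rb") as input:
    with open(fileOut, 'wb') as output:
        w = csv.writer(output, delimiter='|', quoting=csv.QUOTE_NONE, quotechar='')
        oldRecord = None
        for record in csv.reader(input, delimiter='|', quoting=csv.QUOTE_NONE):
            if oldRecord is not None:
                appendRecords(record, oldRecord)
                record = oldRecord
            elif not record:
                # Skip blank lines that are not part of a multi-line field
                continue

            if record[-1].endswith('"'):
                # Record is complete: strip the outer quotes and write it
                print([s[1:-1] for s in record])
                w.writerow([s[1:-1] for s in record])
                oldRecord = None
            else:
                # Record continues on the next line
                oldRecord = record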
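For experimenting without files, the same row-merging idea can be wrapped in a generator and run against the updated sample from the question. This is a hedged sketch rather than a drop-in replacement: it assumes Python 3, the names `logical_records` and `sample` are hypothetical, and it joins the output with "|".join instead of csv.writer, which only works while no field contains the delimiter.

```python
import csv
import io


def logical_records(rows):
    """Yield complete records, merging rows that were split by an embedded newline."""
    pending = None
    for row in rows:
        if not row:
            continue  # blank physical line, e.g. an empty line inside a field
        if pending is not None:
            # Join the unfinished last field with the first field of this row.
            row = pending[:-1] + [pending[-1] + " " + row[0]] + row[1:]
        if row[-1].endswith('"'):
            # Record is complete: strip the outer quotes from every field.
            yield [field[1:-1] for field in row]
            pending = None
        else:
            pending = row


# In-memory copy of the updated sample from the question.
sample = (
    '"id"|"comments"|"Date"\n'
    '"B-7"|"Hi How . \n'
    '\n'
    '\n'
    'Are You."|"2017-03-15 13:53:23.727"\n'
    '"8-C"|"How was "your day" today"|"2017-02-06 11:45:26.783"\n'
)

reader = csv.reader(io.StringIO(sample), delimiter="|", quoting=csv.QUOTE_NONE)
records = list(logical_records(reader))
for record in records:
    print("|".join(record))
```

The check on the trailing quote is a heuristic: a field that happens to end with a quote right before an embedded line break would be treated as complete too early.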
