Python parse CSV ignoring comma with double-quotes
I have a CSV file with lines like this:
"AAA", "BBB", "Test, Test", "CCC"
"111", "222, 333", "XXX", "YYY, ZZZ"
and so on ...
I don't want commas inside double quotes to be treated as delimiters, i.e. my expected result should be
AAA
BBB
Test, Test
CCC
My code:
import csv

with open('values.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
I tried using the csv package in Python, but with no luck. The parser splits on every comma.
Please let me know if I'm missing something.
This should do it:
import csv

lines = '''"AAA", "BBB", "Test, Test", "CCC"
"111", "222, 333", "XXX", "YYY, ZZZ"'''.splitlines()
for l in csv.reader(lines, quotechar='"', delimiter=',',
                    quoting=csv.QUOTE_ALL, skipinitialspace=True):
    print(l)
>>> ['AAA', 'BBB', 'Test, Test', 'CCC']
>>> ['111', '222, 333', 'XXX', 'YYY, ZZZ']
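The quote characters in the input are preceded by spaces. Setting skipinitialspace to True skips any whitespace after the delimiter:
When True, whitespace immediately following the delimiter is ignored. The default is False.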
>>> import csv
>>> lines = '''\
... "AAA", "BBB", "Test, Test", "CCC"
... "111", "222, 333", "XXX", "YYY, ZZZ"
... '''
>>> reader = csv.reader(lines.splitlines())
>>> next(reader)
['AAA', ' "BBB"', ' "Test', ' Test"', ' "CCC"']
>>> reader = csv.reader(lines.splitlines(), skipinitialspace=True)
>>> next(reader)
['AAA', 'BBB', 'Test, Test', 'CCC']
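The same option applies when reading from a file on disk instead of a list of strings. The following is a minimal Python 3 sketch, assuming the sample rows from the question. The temporary file and the `read_rows` helper name are made up for illustration.

```python
import csv
import os
import tempfile

# Sample rows from the question, with a space after every comma.
SAMPLE = '"AAA", "BBB", "Test, Test", "CCC"\n"111", "222, 333", "XXX", "YYY, ZZZ"\n'


def read_rows(path):
    """Return all rows of a CSV file, ignoring spaces that follow a delimiter."""
    # newline='' is what the csv module docs recommend for text-mode files.
    with open(path, newline='') as f:
        return list(csv.reader(f, skipinitialspace=True))


# Write the sample to a throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write(SAMPLE)

rows = read_rows(tmp.name)
os.remove(tmp.name)

for row in rows:
    print(row)
```

Without `skipinitialspace=True`, the leading space keeps the following quote from being recognized as the start of a quoted field, which is why the original code split `"Test, Test"` in two.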
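[Post edited for clarity.] If you don't want commas inside double quotes to be parsed, so that your output keeps the commas within a column, here is another approach. It is elegant and lets you keep your CSV files in cloud storage buckets. The key is to use [smart_open][1] as a drop-in replacement for the standard file open.
Also, I use [DictReader][2] rather than reader.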
import csv
import json
from smart_open import open

with open('./temp.csv') as csvFileObj:
    reader = csv.DictReader(csvFileObj, delimiter=',', quotechar='"')
    # csv.reader requires bytestring input in python2, unicode input in python3
    for record in reader:
        # record is a dictionary of the csv record
        print(f'Record as json shows proper reading of file:\n {json.dumps(record, indent=4)})')
        print(f'You can reference an individual field too: {record["field3"]}')
        print(f' {record["field4"]}')
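Note that I pass two arguments to DictReader: delimiter=',' and quotechar='"'. Both are already the csv module's defaults (comma as the delimiter, double quote as the quote character), so they are spelled out only in case someone needs to change them. Actual output of the code: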
Record as json shows proper reading of file:
{
    "field1": "AAA",
    "field2": "BBB",
    "field3": "Test, Test",
    "field4": "CCC"
})
You can reference an individual field too: Test, Test
CCC
done
Record as json shows proper reading of file:
{
    "field1": "111",
    "field2": "222, 333",
    "field3": "XXX",
    "field4": "YYY, ZZZ"
})
You can reference an individual field too: XXX
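YYY, ZZZ
Input data file (I added a header record for clarity. If your file has no header record, DictReader consumes the first record as the field names; passing the fieldnames parameter avoids that.)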
"field1","field2","field3","field4"
"AAA","BBB","Test, Test","CCC"
"111","222, 333","XXX","YYY, ZZZ"
I hope this helps someone.
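On the header caveat: when the file has no header record, `csv.DictReader` accepts a `fieldnames` argument so the first data row is not consumed as column names. Below is a small sketch using the standard library only, with `io.StringIO` standing in for the file. The column names are hypothetical and supplied by hand.

```python
import csv
import io

# The same two data rows as the input file, but without a header record.
HEADERLESS = '"AAA","BBB","Test, Test","CCC"\n"111","222, 333","XXX","YYY, ZZZ"\n'

# Hypothetical column names, given explicitly because the data has none.
FIELDNAMES = ['field1', 'field2', 'field3', 'field4']

# With fieldnames set, DictReader treats every row as data.
reader = csv.DictReader(io.StringIO(HEADERLESS), fieldnames=FIELDNAMES)
records = list(reader)

for record in records:
    print(record['field3'])
```

Both rows come back as records here, and the comma inside each quoted field is preserved.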