How to split CSV rows containing multiple string values on commas, without considering commas inside curly brackets { }

I am reading a CSV file and trying to split its rows on commas. In my case, some values contain commas as part of the value itself, and those values start with { and end with }.

My split function:

    def process(self, row):
        """
        Splits each row on commas
        """
        Uid, controlNo, profileType, LAStpointDetail, LastPointDate = row.split(",")

My row example:

0923,41003,Permanent,{""Details"": [{""data"": {""status"": ""FAILURE"", ""id"": ""12345""}, ""DetailType"": ""Existing""}]},2019-06-27

In this row, the "LAStpointDetail" value already contains multiple commas. How do I split the whole row on commas without breaking that value apart?
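To make the failure concrete, here is a minimal sketch that runs the plain `split(",")` on the example row above. The variable name `parts` is introduced only for illustration.

```python
# The example row, copied as pasted above (unquoted fields, doubled quotes).
row = ('0923,41003,Permanent,{""Details"": [{""data"": {""status"": ""FAILURE"", '
       '""id"": ""12345""}, ""DetailType"": ""Existing""}]},2019-06-27')

# A plain split cuts at every comma, including those inside the braces.
parts = row.split(",")
print(len(parts))  # 7 pieces, not the 5 that the unpacking expects

try:
    Uid, controlNo, profileType, LAStpointDetail, LastPointDate = parts
except ValueError as exc:
    # Unpacking 7 values into 5 names raises ValueError.
    print("unpacking failed:", exc)
```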

What you seem to have here is CSV data in which one column is encoded as JSON.

It's not possible to tell from the question exactly how the data is quoted (it would have been better to paste the repr of the row), but let's assume it's something like this:

'"0923","41003","Permanent","{""Details"": [{""data"": {""status"": ""FAILURE"", ""id"": ""12345""}, ""DetailType"": ""Existing""}]}","2019-06-27"' 
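That quoting is what Python's own `csv.writer` emits when a JSON string is written as one field with every field quoted, which is why the assumption is plausible. The sketch below only illustrates this; the `detail`, `buffer` and `line` names are made up here, and how the original file was actually produced is not known.

```python
import csv
import io
import json

# Hypothetical structure matching the embedded JSON in the example row.
detail = {"Details": [{"data": {"status": "FAILURE", "id": "12345"},
                       "DetailType": "Existing"}]}

buffer = io.StringIO()
# QUOTE_ALL wraps every field in quotes; embedded quotes are doubled.
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
writer.writerow(["0923", "41003", "Permanent", json.dumps(detail), "2019-06-27"])

# Drop the line terminator that the writer appends.
line = buffer.getvalue().rstrip("\r\n")
print(line)
```

Data written this way round-trips cleanly through `csv.reader`, because the quotes around the JSON field protect the commas inside it.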

If that's true, and you have a file of this data containing multiple rows, you can use the csv module to read it:

import csv
import json

# In Python 3, open CSV files in text mode with newline='' (not 'rb').
with open('myfile.csv', newline='') as f:
    reader = csv.reader(f)
    # Uncomment the next line if the first row contains the headers.
    # next(reader)    # Skip header row.
    for row in reader:
        Uid, controlNo, profileType, LAStpointDetail, LastPointDate = row
        # Load embedded json into a dictionary.
        detail_dict = json.loads(LAStpointDetail)
        # Do something with these values.

If you only have a single row as a string, you can still use the csv module:

>>> import csv
>>> import json
>>> row = '"0923","41003","Permanent","{""Details"": [{""data"": {""status"": ""FAILURE"", ""id"": ""12345""}, ""DetailType"": ""Existing""}]}","2019-06-27"'
>>> # csv.reader expects an iterable of lines, so wrap the row in a list.
>>> reader = csv.reader([row])
>>> Uid, controlNo, profileType, LAStpointDetail, LastPointDate = next(reader)
>>> print(Uid)
0923
>>> d = json.loads(LAStpointDetail)
>>> d
{'Details': [{'data': {'status': 'FAILURE', 'id': '12345'}, 'DetailType': 'Existing'}]}
>>> print(d['Details'][0]['data']['status'])
FAILURE
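If the rows really are unquoted, exactly as pasted in the question, `csv.reader` would split inside the braces too. For that case, here is a minimal sketch of a splitter that only cuts at commas outside any `{ }` or `[ ]`. The helper name `split_top_level` is hypothetical, and the sketch assumes that no string value itself contains braces or brackets and that the doubled quotes are the only escaping in the field.

```python
import json


def split_top_level(row, sep=","):
    """Split row on sep, ignoring separators nested inside {} or []."""
    fields, current, depth = [], [], 0
    for ch in row:
        if ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1
        if ch == sep and depth == 0:
            # Top-level separator: close the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


row = ('0923,41003,Permanent,{""Details"": [{""data"": {""status"": ""FAILURE"", '
       '""id"": ""12345""}, ""DetailType"": ""Existing""}]},2019-06-27')

Uid, controlNo, profileType, LAStpointDetail, LastPointDate = split_top_level(row)

# Assumption: the doubled quotes are CSV-style escapes, so collapse them
# back to single quotes before parsing the JSON.
detail = json.loads(LAStpointDetail.replace('""', '"'))
print(detail["Details"][0]["data"]["status"])  # FAILURE
```

This is a fallback rather than a replacement for the csv module: properly quoted data is more robust, since a depth counter cannot tell a brace inside a string value from a structural one.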
