
Create a dict of lists from a CSV file using Python

I have a CSV file with the data below:

XPATH,ColumName,CSV_File_Name,ParentKey
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/id,id,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/id,aid,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/Internalid,Internalid,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/isDelete,FormId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/id,SupplierFormRecordFieldId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/value,SupplierFormRecordFieldValue,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/integrationTrackingNumber,integrationTrackingNumber,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityHeader/referenceCodeForEntity,referenceCodeForEntity,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/requestId,requestId,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/id,supplier_id,integrationEntityDetailsForms.csv,Y

Sample CSV file

I want to create a dictionary of lists like the one below. Basically, split each XPATH on [] and, for every row, put the first piece (index [0]) into the first list, the next level into the second list, and so on. Records that don't contain [] should be discarded. This gives the list of tags at each level.

{ 1 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form', 'integration-outbound:IntegrationEntity.integrationEntityHeader.attachments.attachment'] , 2 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'] , 3 : ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record.fields.field'] }
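
To make the intended transformation concrete, here is a minimal sketch for a single XPATH value. The helper name level_prefixes and the shortened example path are hypothetical, not from the original post; the sketch assumes [] marks a repeating element and that a path without [] yields no levels.

```python
from itertools import accumulate

def level_prefixes(xpath):
    """Return the dotted tag path at each [] nesting level (hypothetical helper)."""
    # "/a/b[]/c[]/d" becomes ["a.b", ".c", ".d"] after these three calls.
    parts = xpath.strip("/").replace("/", ".").split("[]")
    # The piece after the last "[]" is a leaf, not a level, so drop it.
    # accumulate() then concatenates the remaining strings cumulatively.
    return list(accumulate(parts[:-1]))

example = "/root/forms/form[]/records/record[]/id"
prefixes = level_prefixes(example)
print(prefixes)
```

The position of each prefix in the returned list (counting from 1) is the level it belongs to in the target dictionary.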

So far I have managed to split the string on [], convert each / to ., and accumulate the split pieces. But I am stuck on putting them back into a dictionary of lists, which would give me the level at which each tag sits.

df_process_sub_explode_Level gives the individual entries for each row in the CSV, but I still need to remove duplicates and populate the dict.

import csv
from itertools import accumulate

CSV_File_Name = []
with open(process_config_csv, newline='') as csvfile:
    DataCaptured = csv.DictReader(csvfile)
    for row in DataCaptured:
        if row['CSV_File_Name'] not in CSV_File_Name:
            CSV_File_Name.append(row['CSV_File_Name'])

df_process = []
df_process_all_col = []
df_process_explode_Level = dict()
for items in CSV_File_Name:
    df_subset_process = []
    df_subset_list_all_cols = []
    with open(process_config_csv, newline='') as csvfile:
        DataCaptured = csv.DictReader(csvfile)
        for row in DataCaptured:
            df_process_sub_explode_Level = []
            if row['CSV_File_Name'] in items:
                df_subset_process.append(row['XPATH'].replace("/",".").split('[]')[0].replace(".","",1))
                df_subset_list_all_cols.append(row['XPATH'].replace("/",".").replace("[]","").replace(".","",1))
                if "[]" in row['XPATH']:
                    print(row['XPATH'])
                    df_process_sub_explode_Level=row['XPATH'].replace("/",".").replace(".","",1).split('[]')
                    del df_process_sub_explode_Level[-1]
                    df_process_sub_explode_Level = list(accumulate(df_process_sub_explode_Level))
                    for explodeitems in range(len(df_process_sub_explode_Level)):
                        df_process_explode_Level[explodeitems].append(df_process_sub_explode_Level[explodeitems])  # fails here

Error:

Traceback (most recent call last):
  File "<stdin>", line 17, in <module>
KeyError: 0

Please guide me on how to put these back into the dict of lists.
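
For context, KeyError: 0 is what a plain dict raises when it is indexed with a key that has not been created yet: the lookup df_process_explode_Level[explodeitems] happens before .append can run. A minimal sketch of the problem and one possible way around it with dict.setdefault (the variable names are illustrative only):

```python
levels = {}

try:
    levels[0].append("a.b")      # key 0 does not exist yet, so the lookup fails
except KeyError as exc:
    error_key = exc.args[0]      # the missing key

# setdefault() inserts an empty list for a missing key and returns it,
# so the append always has a list to work on.
levels.setdefault(0, []).append("a.b")
levels.setdefault(0, []).append("a.b.c")
print(levels)
```

collections.defaultdict(list) achieves the same effect without calling setdefault at every use.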

Try this:

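from csv import DictReader
from collections import defaultdict

with open('data.csv', newline='') as fp:
    csv_reader = DictReader(fp)
    # Turn each XPATH into dotted form and split it on the [] markers.
    data = [row['XPATH'].strip('/').replace('/', '.').split('[]') for row in csv_reader]

# A set per level removes duplicates automatically.
res = defaultdict(set)
for x in data:
    # Rows without [] split into a single piece and are skipped.
    if len(x) > 1:
        # Number of [] markers = level; join everything before the last piece.
        res[len(x) -1].add(''.join(x[: -1]))
res = {k: list(v) for k, v in res.items()}
print(res)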

Output:

{1: ['integration-outbound:IntegrationEntity.integrationEntityHeader.attachments.attachment',
  'integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form'],
 2: ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'],
 3: ['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record.fields.field']}
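
An alternative sketch that stays closer to the approach in the question: it uses itertools.accumulate to register every nesting level of each path (not only the deepest one), and keeps first-seen order in plain lists rather than relying on set ordering, so the result is reproducible. The inline SAMPLE data and the function name build_levels are assumptions for illustration; with a real file you would pass the opened file object instead.

```python
import csv
import io
from collections import defaultdict
from itertools import accumulate

# Shortened stand-in for the CSV in the question (assumed data).
SAMPLE = """XPATH,ColumName,CSV_File_Name,ParentKey
/root/details/forms/form[]/id,id,out.csv,
/root/header/attachments/attachment[]/id,aid,out.csv,
/root/details/forms/form[]/records/record[]/fields/field[]/value,val,out.csv,
/root/header/trackingNumber,trackingNumber,out.csv,Y
"""

def build_levels(fileobj):
    """Map each [] nesting level (1-based) to its de-duplicated tag paths."""
    levels = defaultdict(list)
    for row in csv.DictReader(fileobj):
        parts = row["XPATH"].strip("/").replace("/", ".").split("[]")
        # parts[:-1] is empty for rows without [], so those rows add nothing.
        for depth, prefix in enumerate(accumulate(parts[:-1]), start=1):
            if prefix not in levels[depth]:
                levels[depth].append(prefix)
    return dict(levels)

result = build_levels(io.StringIO(SAMPLE))
for depth, tags in result.items():
    print(depth, tags)
```

The membership test on a list is linear, which is fine for a small config file; for large inputs a companion set per level would be the usual substitute.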
