Packing a DataFrame into JSON for each unique ID in a column

Question

The sample dataframe, df:

SITE_ID     PRO_ID   PRO_ID     TXN_ID      LINE_ID         INST_ID     QUOTE       N1  N2  N3  R1  R2  R3
93672863K   PR_I     T_ID_PORT  283747E11   439329095       254553919   DISCOUNT    1   2   3   6   8   9
93672863K   PR_PI    T_PIP_COS  283747E12   8123619000      200613005   DISCOUNT    2   7   3   3   6   7
93672863K   PR_PI    T_PIP_PORT 283747E13   8123618999      200613003   DISCOUNT    6   5   9   1   5   9
93672863K   PR_PI    T_PIP_PORT 283747E14   8123618999      200613003   DISCOUNT    3   5   7   7   5   3
93672863K   PR_I     T_ID_PORT  283747E11   439329095       254553919   N-DISCOUNT  1   2   3   4   8   6
93672863K   PR_PI    T_PIP_COS  283747E12   8123619000      200613005   N-DISCOUNT  2   7   3   1   5   3
93672863K   PR_PI    T_PIP_PORT 283747E13   8123618999      200613003   N-DISCOUNT  6   5   9   8   4   2
93672863K   PR_PI    T_PIP_PORT 283747E14   8123618999      200613003   N-DISCOUNT  3   5   7   6   8   4

I'm trying to pack the dataframe into a JSON file using the code below:

JSON_Dict = {"siteID": df.SITE_ID[0],
             "status": 1,
             "Message": None}

detail_LIST = []

# One entry per row; this does not yet account for the QUOTE column
for i in range(len(df)):
    detail_Dict_i = {"instID": df.INST_ID[i],
                     "ItemID": df.LINE_ID[i],
                     "opticount": [
                         # N
                         {"N1": df.N1[i],
                          "N2": df.N2[i],
                          "N3": df.N3[i]},
                         # R
                         {"R1": df.R1[i],
                          "R2": df.R2[i],
                          "R3": df.R3[i]},
                     ]}
    detail_LIST.append(detail_Dict_i)

JSON_Dict["InstDetail"] = detail_LIST

This worked for me for the first four rows, without the QUOTE column.

When the QUOTE column was added, the values of R1, R2 and R3 started to vary with the quote type. Now I'm trying to include the R columns for both QUOTE types, as shown below:

{
"siteID": "93672863K",
 "status": 1,
 "Message": None,

"InstDetail" :[
                {"instID": 254553919,
                  "ItemID": 439329095,
                  "opticount": [
                                 #N
                                {"N1": 1,
                                 "N2": 2,
                                 "N3": 3},

                                 #R with DISCOUNT
                                {"Quote": "DISCOUNT",
                                "R1": 6,
                                 "R2": 8,
                                 "R3": 9},

                                 #R with N -DISCOUNT
                                {"Quote": "N-DISCOUNT",
                                "R1": 4,
                                 "R2": 8,
                                 "R3": 6}

                                ]
                 },

                 {"instID": 200613005,
                  "ItemID": 8123619000,
                  "opticount": [
                                 #N
                                {"N1": 2,
                                 "N2": 7,
                                 "N3": 3},

                                 #R with DISCOUNT
                                {"Quote": "DISCOUNT",
                                "R1": 3,
                                 "R2": 6,
                                 "R3": 7},

                                 #R with N -DISCOUNT
                                {"Quote": "N-DISCOUNT",
                                "R1": 1,
                                 "R2": 5,
                                 "R3": 3}

                                ]
                 }, # other records


]


 }

I couldn't find the logic to pack the JSON so that each INST_ID carries its N columns once and its R columns once per QUOTE type.

I'm open to new ideas and approaches.

Answer 1

You can do it easily using groupby():

result = {
    'siteID': df.SITE_ID[0],
    'status': 1,
    'Message': None,
    'InstDetail': [],
}

for items, group_df in df.groupby(['INST_ID', 'LINE_ID', 'N1', 'N2', 'N3']):
    inst_id, line_id, n1, n2, n3 = items
    detail = {
        'instID': inst_id,
        'ItemID': line_id,
        'opticount': [{
            'N1': n1,
            'N2': n2,
            'N3': n3,
        }]
    }

    for rec in group_df.to_dict('records'):
        detail['opticount'].append({
            'Quote': rec['QUOTE'],
            'R1': rec['R1'],
            'R2': rec['R2'],
            'R3': rec['R3'],
        })

    result['InstDetail'].append(detail)

print(result)
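
Since the question asks for a JSON file, here is a minimal, self-contained sketch of the same groupby() idea that also produces a JSON string. It assumes only two instances from the sample data, and the helper name build_payload is made up for illustration. The int() casts are there because pandas typically hands back NumPy integers, which the standard json module does not serialize by default.

```python
import json

import pandas as pd

# Two instances from the question's sample, one row per QUOTE type.
df = pd.DataFrame({
    "SITE_ID": ["93672863K"] * 4,
    "LINE_ID": [439329095, 8123619000, 439329095, 8123619000],
    "INST_ID": [254553919, 200613005, 254553919, 200613005],
    "QUOTE": ["DISCOUNT", "DISCOUNT", "N-DISCOUNT", "N-DISCOUNT"],
    "N1": [1, 2, 1, 2],
    "N2": [2, 7, 2, 7],
    "N3": [3, 3, 3, 3],
    "R1": [6, 3, 4, 1],
    "R2": [8, 6, 8, 5],
    "R3": [9, 7, 6, 3],
})


def build_payload(frame):
    """Hypothetical helper: nest the R values per QUOTE under each instance."""
    payload = {
        "siteID": str(frame["SITE_ID"].iloc[0]),
        "status": 1,
        "Message": None,
        "InstDetail": [],
    }
    keys = ["INST_ID", "LINE_ID", "N1", "N2", "N3"]
    # sort=False keeps the groups in order of first appearance.
    for (inst_id, line_id, n1, n2, n3), group in frame.groupby(keys, sort=False):
        # The N values are shared by the whole group, so they appear once.
        opticount = [{"N1": int(n1), "N2": int(n2), "N3": int(n3)}]
        # Each row in the group contributes one R block tagged with its QUOTE.
        for rec in group.to_dict("records"):
            opticount.append({
                "Quote": rec["QUOTE"],
                "R1": int(rec["R1"]),
                "R2": int(rec["R2"]),
                "R3": int(rec["R3"]),
            })
        payload["InstDetail"].append({
            "instID": int(inst_id),
            "ItemID": int(line_id),
            "opticount": opticount,
        })
    return payload


payload = build_payload(df)
json_text = json.dumps(payload, indent=2)
print(json_text)
```

To write a file instead of a string, json.dump(payload, fh) with an open file handle fh should work the same way. Note that None is emitted as null in the JSON output.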

Answer 1, accepted 2020-10-09 12:36:30
