
How to flatten a list of dictionaries with non-constant attributes

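Until recently, I used the relatively simple code below to fetch location data from an API, flatten the response, and convert it into a flat DataFrame/table. It worked well because the key "ExtendedAttributes" returned a constant set of values in a constant order. The API schema changed overnight (without warning) and now returns different values depending on each location's attributes.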

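The simpler approach of flattening the nested list of dictionaries into a table and renaming the columns no longer works. I'm not sure how to handle a case like this or which approach to try first.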

Script

import pandas as pd
# Assumption: flatten() comes from the flatten_json package; the original post does not show its import.
from flatten_json import flatten

# Fetch the initial location description list
locationDescriptions = timeseries.publish.get('/GetLocationDescriptionList')['LocationDescriptions']

# Loop through the provisioning API to get full location info for every UniqueId
locations = timeseries.provisioning.send_batch_requests('/locations/{Id}', [{'LocationUniqueId': loc['UniqueId']} for loc in locationDescriptions])

# The API response in 'locations' is nested JSON, so we need to unpack/flatten it.
dic_flattened = [flatten(d) for d in locations]

df_flat = pd.DataFrame(dic_flattened)

# Give the value columns matching names
df_flat.rename(columns={'ExtendedAttributeValues_0_Value': 'COUNTY' ...}, inplace=True)

First location

 "ExtendedAttributeValues": [
{
  "ColumnIdentifier": "COUNTY@LOCATION_EXTENSION",
  "Value": "Okaloosa - FL",
  "UniqueId": "538e05b45a9a4b31a46cf96c4ffab8cb"
},
{
  "ColumnIdentifier": "GW_REGION@LOCATION_EXTENSION",
  "Value": "Western Panhandle Embayment Region",
  "UniqueId": "5f51ebde984c4bdd92dff067cbe5b39b"
},
{
  "ColumnIdentifier": "LAND_NET@LOCATION_EXTENSION",
  "Value": "S016T3NR22W",
  "UniqueId": "8d8139c9027a497f9cae4ef7471930ba"
}
]

Second location (attributes no longer match)

"ExtendedAttributeValues": [
{
  "ColumnIdentifier": "DATA_USED@GW_EXTENSION",
  "Value": "",
  "UniqueId": "dace52af725b42a9aa63aa8e1b9a1b74"
},
{
  "ColumnIdentifier": "TOP_BUCATUNA@GW_EXTENSION",
  "Value": "",
  "UniqueId": "352e5763d90748a490b32ba833a65d1c"
},
{
  "ColumnIdentifier": "TOP_FLORIDAN@GW_EXTENSION",
  "Value": "",
  "UniqueId": "b940292e63e84214ab785584f420674b"
}
]

Flattening now produces a table like this:

ExtendedAttributeValues_0_Value ExtendedAttributeValues_1_ColumnIdentifier ExtendedAttributeValues_1_Value ExtendedAttributeValues_2_ColumnIdentifier ExtendedAttributeValues_2_Value
COUNTY@LOCATION_EXTENSION Okaloosa - FL GW_REGION@LOCATION_EXTENSION Western Panhandle Embayment Region LAND_NET@LOCATION_EXTENSION
DATA_USED@GW_EXTENSION TOP_BUCATUNA@GW_EXTENSION TOP_FLORIDAN@GW_EXTENSION

But I want to turn each "ColumnIdentifier" into a column name and fill that column's rows with the associated "Value":

DATA_USED GW_REGION TOP_BUCATUNA LAND_NET TOP_FLORIDAN
Okaloosa - FL Western Panhandle Embayment Region S016T3NR22W

data = {
     "ExtendedAttributeValues": [
    {
      "ColumnIdentifier": "COUNTY@LOCATION_EXTENSION",
      "Value": "Okaloosa - FL",
      "UniqueId": "538e05b45a9a4b31a46cf96c4ffab8cb"
    },
    {
      "ColumnIdentifier": "GW_REGION@LOCATION_EXTENSION",
      "Value": "Western Panhandle Embayment Region",
      "UniqueId": "5f51ebde984c4bdd92dff067cbe5b39b"
    },
    {
      "ColumnIdentifier": "LAND_NET@LOCATION_EXTENSION",
      "Value": "S016T3NR22W",
      "UniqueId": "8d8139c9027a497f9cae4ef7471930ba"
    }
    ]
}

data2 = {
    "ExtendedAttributeValues": [
    {
      "ColumnIdentifier": "DATA_USED@GW_EXTENSION",
      "Value": "",
      "UniqueId": "dace52af725b42a9aa63aa8e1b9a1b74"
    },
    {
      "ColumnIdentifier": "TOP_BUCATUNA@GW_EXTENSION",
      "Value": "",
      "UniqueId": "352e5763d90748a490b32ba833a65d1c"
    },
    {
      "ColumnIdentifier": "TOP_FLORIDAN@GW_EXTENSION",
      "Value": "",
      "UniqueId": "b940292e63e84214ab785584f420674b"
    }
    ]
}

Given either dictionary, we can simply read it with pd.json_normalize:

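import pandas as pd

# The second argument is the record path: the key that holds the list of attribute dicts.
df1 = pd.json_normalize(data, 'ExtendedAttributeValues')
df2 = pd.json_normalize(data2, 'ExtendedAttributeValues')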

Output:

  • df1
               ColumnIdentifier                               Value                          UniqueId
0     COUNTY@LOCATION_EXTENSION                       Okaloosa - FL  538e05b45a9a4b31a46cf96c4ffab8cb
1  GW_REGION@LOCATION_EXTENSION  Western Panhandle Embayment Region  5f51ebde984c4bdd92dff067cbe5b39b
2   LAND_NET@LOCATION_EXTENSION                         S016T3NR22W  8d8139c9027a497f9cae4ef7471930ba
  • df2
            ColumnIdentifier Value                          UniqueId
0     DATA_USED@GW_EXTENSION        dace52af725b42a9aa63aa8e1b9a1b74
1  TOP_BUCATUNA@GW_EXTENSION        352e5763d90748a490b32ba833a65d1c
2  TOP_FLORIDAN@GW_EXTENSION        b940292e63e84214ab785584f420674b
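
The frames above are still in long form (one row per attribute), whereas the question asks for one column per ColumnIdentifier. A minimal sketch of that reshaping follows, assuming each response holds at most one entry per ColumnIdentifier. The helper name `to_wide_row` is hypothetical, the sample dictionaries are shortened copies of the ones above, and stripping everything after "@" is a guess at the desired column names.

```python
import pandas as pd

# Shortened copies of the two sample responses (UniqueId omitted for brevity).
data = {"ExtendedAttributeValues": [
    {"ColumnIdentifier": "COUNTY@LOCATION_EXTENSION", "Value": "Okaloosa - FL"},
    {"ColumnIdentifier": "GW_REGION@LOCATION_EXTENSION",
     "Value": "Western Panhandle Embayment Region"},
    {"ColumnIdentifier": "LAND_NET@LOCATION_EXTENSION", "Value": "S016T3NR22W"},
]}
data2 = {"ExtendedAttributeValues": [
    {"ColumnIdentifier": "DATA_USED@GW_EXTENSION", "Value": ""},
    {"ColumnIdentifier": "TOP_BUCATUNA@GW_EXTENSION", "Value": ""},
    {"ColumnIdentifier": "TOP_FLORIDAN@GW_EXTENSION", "Value": ""},
]}


def to_wide_row(response):
    """Hypothetical helper: one-row frame with one column per ColumnIdentifier."""
    long_df = pd.json_normalize(response, "ExtendedAttributeValues")
    # Keep only the part before "@", e.g. "COUNTY@LOCATION_EXTENSION" -> "COUNTY".
    names = long_df["ColumnIdentifier"].str.split("@").str[0]
    return pd.DataFrame([dict(zip(names, long_df["Value"]))])


# concat aligns on column names, so attributes a location lacks become NaN.
wide = pd.concat([to_wide_row(data), to_wide_row(data2)], ignore_index=True)
print(wide)
```

Because the columns are matched by name rather than by position, the order in which the API returns the attributes no longer matters.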

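If your JSON response is larger than this, for example if the old Value is now stored at a higher level, you can consult the pd.json_normalize documentation to see how to extract that information, or update your question with enough information for it to be answered properly.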
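
As one illustration of pulling higher-level information along, the sketch below uses the `meta` argument of pd.json_normalize and then pivots to one row per location. The shape of each entry and the location-level key "Identifier" are assumptions made for the example; the real response may use different names.

```python
import pandas as pd

# Hypothetical shape of two entries in `locations`; "Identifier" is an assumed key.
locations = [
    {"Identifier": "LOC-A", "ExtendedAttributeValues": [
        {"ColumnIdentifier": "COUNTY@LOCATION_EXTENSION", "Value": "Okaloosa - FL"},
        {"ColumnIdentifier": "LAND_NET@LOCATION_EXTENSION", "Value": "S016T3NR22W"},
    ]},
    {"Identifier": "LOC-B", "ExtendedAttributeValues": [
        {"ColumnIdentifier": "DATA_USED@GW_EXTENSION", "Value": ""},
    ]},
]

# One row per attribute, with the location-level key repeated on each row.
long_df = pd.json_normalize(
    locations, record_path="ExtendedAttributeValues", meta=["Identifier"]
)

# Drop the "@..." suffix so the column names are the bare attribute names.
long_df["Attribute"] = long_df["ColumnIdentifier"].str.split("@").str[0]

# One row per location, one column per attribute; missing attributes become NaN.
wide = long_df.pivot(
    index="Identifier", columns="Attribute", values="Value"
).reset_index()
wide.columns.name = None
print(wide)
```

Note that `pivot` raises a ValueError if a location repeats the same attribute, and a location whose attribute list is empty produces no rows in `long_df`, so it would be missing from the result.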
