
Unnest dict to unique key-value pairs

I want to build a two-column DataFrame with the from_dict() method. The goal is to turn the dict into all of its unique key-value pairs, for example:

{'83d945fffffffff': {'83d940fffffffff',
  '83d941fffffffff',
  '83d944fffffffff',
  '83d963fffffffff',
  '83d96afffffffff',
  '83d96efffffffff'},
 '83bcf2fffffffff': {'83bc8dfffffffff', '83bcf6fffffffff'}...

should become

                 k                v
0  83d945fffffffff  83d940fffffffff
1  83d945fffffffff  83d941fffffffff
2  83d945fffffffff  83d944fffffffff
3  83d945fffffffff  83d963fffffffff
4  83d945fffffffff  83d96afffffffff
5  83d945fffffffff  83d96efffffffff
6  83bcf2fffffffff  83bc8dfffffffff
7  83bcf2fffffffff  83bcf6fffffffff

However, specifying orient='index' does not give this result. Instead it produces one row per key and pads the shorter rows with None:

    0                1                2                3                4                5                6
    83d945fffffffff  83d96efffffffff  83d963fffffffff  83d941fffffffff  83d940fffffffff  83d944fffffffff  83d96afffffffff
    83bcf2fffffffff  83bc8dfffffffff  83bcf6fffffffff  None             None             None             None

Is there a known workaround or an efficient way to generate a two-column DataFrame directly from the dict?

Pass d.items() from your dict to pd.DataFrame, then explode the 'v' column:

df = pd.DataFrame(d.items(), columns=['k', 'v']).explode('v').reset_index(drop=True)
print(df)

# Output
                 k                v
0  83d945fffffffff  83d963fffffffff
1  83d945fffffffff  83d96afffffffff
2  83d945fffffffff  83d941fffffffff
3  83d945fffffffff  83d940fffffffff
4  83d945fffffffff  83d944fffffffff
5  83d945fffffffff  83d96efffffffff
6  83bcf2fffffffff  83bcf6fffffffff
7  83bcf2fffffffff  83bc8dfffffffff

Setup:

d = {'83d945fffffffff': {'83d940fffffffff',
  '83d941fffffffff',
  '83d944fffffffff',
  '83d963fffffffff',
  '83d96afffffffff',
  '83d96efffffffff'},
 '83bcf2fffffffff': {'83bc8dfffffffff', '83bcf6fffffffff'}}
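
Note that the row order within each key above differs from the desired table in the question, because sets have no defined iteration order. If a reproducible order matters, one option (a minimal sketch, not part of the original answer) is to convert each set to a sorted list before exploding. Sorting alphabetically is an assumption made here; any other deterministic ordering would work the same way.

```python
import pandas as pd

d = {
    '83d945fffffffff': {'83d940fffffffff', '83d941fffffffff', '83d944fffffffff',
                        '83d963fffffffff', '83d96afffffffff', '83d96efffffffff'},
    '83bcf2fffffffff': {'83bc8dfffffffff', '83bcf6fffffffff'},
}

# One row per key, with the values held as a sorted list instead of a set
nested = pd.DataFrame({
    'k': list(d),
    'v': [sorted(values) for values in d.values()],
})

# explode() emits one row per list element, in list order
df = nested.explode('v').reset_index(drop=True)
print(df)
```

With the sample dict, the rows come out in the same order as the desired table in the question.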

Here is a quick and dirty nested-loop solution.

import pandas as pd
d = {'83d945fffffffff': {'83d940fffffffff',
  '83d941fffffffff',
  '83d944fffffffff',
  '83d963fffffffff',
  '83d96afffffffff',
  '83d96efffffffff'},
 '83bcf2fffffffff': {'83bc8dfffffffff', '83bcf6fffffffff'}}

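# Collect keys and values in two parallel lists, one entry per pair
k, v = [], []
for ki, vi in d.items():
    # set() changes nothing for set values, but would drop duplicates if the values were lists
    for vii in set(vi):
        k.append(ki)
        v.append(vii)
df = pd.DataFrame({'k': k, 'v': v})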

If you want it more compact, you can put it in a list comprehension:

# One (key, value) tuple per pair, built without relying on append() side effects
pairs = [(k, vi) for k, v in d.items() for vi in set(v)]
df = pd.DataFrame(pairs, columns=['k', 'v'])
df
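
Another route, closer to the from_dict(orient='index') attempt in the question, is to build the wide frame anyway and then reshape it to long form. The following is only a sketch under a few assumptions: the sets are converted to sorted lists first, and dropna() is called explicitly because whether stack() drops the padding on its own appears to depend on the pandas version.

```python
import pandas as pd

d = {
    '83d945fffffffff': {'83d940fffffffff', '83d941fffffffff', '83d944fffffffff',
                        '83d963fffffffff', '83d96afffffffff', '83d96efffffffff'},
    '83bcf2fffffffff': {'83bc8dfffffffff', '83bcf6fffffffff'},
}

# Wide frame: one row per key, shorter rows padded with missing values
wide = pd.DataFrame.from_dict(
    {key: sorted(values) for key, values in d.items()},
    orient='index',
)

# stack() moves the column positions into the index;
# dropna() removes any padding that is still present afterwards
stacked = wide.stack().dropna()

df = (
    stacked.reset_index(level=1, drop=True)  # discard the column-position level
           .rename_axis('k')
           .reset_index(name='v')
)
print(df)
```

This does more work than the explode or list-comprehension versions, so it is mainly useful when the wide frame already exists.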

