Populating a DataFrame from dictionary keys and values

Question

I have the following DataFrame as an example.

import pandas as pd
df_test = pd.DataFrame(data=None, index=["green","yellow","red","pink"], columns=["bear","dog","cat"], dtype=None, copy=False)

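I also have the following dictionary, whose keys correspond to the index of my DataFrame and whose values correspond to its columns.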

d = {"green":["bear","dog"], "yellow":["bear"], "red":["bear"]}

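I want to fill my DataFrame according to the keys and values shown: a cell is marked when the column is listed under that key, and if a key does not exist in the dictionary, the whole row should be filled with "Empty".

Expected output

I can only think of building lists and looping. Is there a simpler way to do this, or a function that could help me?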

Answer 1

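Loop over the dictionary and set 'Yes' for every column listed under each key. Then use mask to replace rows that are entirely missing with 'Empty', and fillna to replace the remaining missing values with 'No':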

for k, v in d.items():
    for x in v:
        df_test.loc[k, x] = 'Yes'

df_test = df_test.mask(df_test.isnull().all(axis=1), 'Empty').fillna('No')
print(df_test)
         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
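
As a rough, self-contained variant of the loop above, the sketch below starts from a frame already filled with 'No', so mask and fillna are not needed. The helper name fill_from_dict and its parameters are made up for illustration; they are not part of pandas or of the original answer.

```python
import pandas as pd


def fill_from_dict(index, columns, mapping):
    """Hypothetical helper: build a Yes/No/Empty frame from a dict of lists."""
    # Start with every cell set to 'No'.
    out = pd.DataFrame("No", index=index, columns=columns)
    for key, cols in mapping.items():
        # Only touch keys that really are row labels (assumption: others are ignored).
        if key in out.index:
            out.loc[key, cols] = "Yes"
    # Rows whose label is not a dictionary key become 'Empty'.
    missing = ~out.index.isin(list(mapping))
    out.loc[missing, :] = "Empty"
    return out


d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}
result = fill_from_dict(["green", "yellow", "red", "pink"], ["bear", "dog", "cat"], d)
print(result)
```

Starting from 'No' keeps every column as text throughout, which may be easier to follow than filling missing values afterwards.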

Answer 2

You can achieve what you want in the following way:

import numpy as np

# index_list may contain labels that are not in the original DataFrame;
# their rows will be filled with 'Empty'.
index_list = ["green", "yellow", "red", "pink", "purple"]

replace_dict = {True: 'Yes', False: 'No', np.nan: 'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty

Explanation

First, for each row, check whether each column of the DataFrame appears in the list stored under that row's key in the dict:

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
    index=x.index), axis=1)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False
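
The same membership test can be seen in isolation, without apply. This is only a sketch: it reuses the column labels and the dictionary from the question, and the name flags is arbitrary.

```python
import pandas as pd

columns = pd.Index(["bear", "dog", "cat"])
d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}

# Index.isin returns one boolean per column label: True where the label is listed.
# Each dictionary key becomes a column here, so transpose to get one row per key.
flags = pd.DataFrame(
    {key: columns.isin(values) for key, values in d.items()},
    index=columns,
).T
print(flags)
```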

Then reindex with the full list of colors, so that the colors missing from the dictionary are added as rows filled with NaN:

index_list = ["green","yellow","red","pink", "purple"]

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
       index=x.index), axis=1).reindex(index_list)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False
pink     NaN    NaN    NaN
purple   NaN    NaN    NaN
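
To see what reindex does on its own, here is a small sketch on a hand-built boolean frame (the frame flags is an assumption standing in for the result of the previous step). Labels that are absent from the original index come back as rows that are entirely NaN.

```python
import pandas as pd

# Stand-in for the boolean frame produced by the membership check.
flags = pd.DataFrame(
    {"bear": [True, True], "dog": [True, False]},
    index=["green", "yellow"],
)

# 'pink' is not in flags.index, so reindex adds it as an all-NaN row.
extended = flags.reindex(["green", "yellow", "pink"])
print(extended)

# Boolean mask of the rows that were introduced by reindex.
print(extended.isna().all(axis=1))
```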

Finally, if you want to change the values, you can replace them using a dictionary like this:

replace_dict = {True: 'Yes', False: 'No', np.nan:'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty
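
If using NaN as a dictionary key feels fragile, one possible alternative is to map each cell through a small function. This is a sketch rather than the answer's method; to_label and the example frame are hypothetical.

```python
import numpy as np
import pandas as pd


def to_label(value):
    """Hypothetical helper: turn True/False/NaN into 'Yes'/'No'/'Empty'."""
    if pd.isna(value):
        return "Empty"
    return "Yes" if value else "No"


# Stand-in for the reindexed frame: booleans plus an all-NaN row.
extended = pd.DataFrame(
    {"bear": [True, True, np.nan], "dog": [True, False, np.nan]},
    index=["green", "yellow", "pink"],
)

# Apply the function to every cell, one column at a time.
labels = extended.apply(lambda col: col.map(to_label))
print(labels)
```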

Answer 3

Here is a vectorised solution using pd.get_dummies and pd.DataFrame.reindex:

df = pd.DataFrame.from_dict(d, orient='index')

# Encode only the keys present in d, and reindex to df_test.index at the end,
# so rows without a dictionary key (e.g. pink) become 'Empty' rather than 'No'.
res = pd.get_dummies(df, prefix='', prefix_sep='')\
        .reindex(columns=df_test.columns)\
        .fillna(0).applymap({0: 'No', 1: 'Yes'}.get)\
        .reindex(index=df_test.index)\
        .fillna('Empty')

print(res)

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
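
It may help to look at the two intermediate frames that this chain builds on. The sketch below only uses the dictionary from the question; the variable names are arbitrary.

```python
import pandas as pd

d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}

# One row per key; the list items are spread over positional columns 0, 1, ...
# Shorter lists are padded with missing values.
df = pd.DataFrame.from_dict(d, orient="index")
print(df)

# One indicator column per distinct value; empty prefix and separator
# leave the plain labels ('bear', 'dog') as column names.
dummies = pd.get_dummies(df, prefix="", prefix_sep="")
print(dummies)
```

Because each positional column is encoded separately, a label that appears at different positions in different lists would, as far as I can tell, produce duplicate column names, so this approach is probably best suited to data shaped like the example.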

Answer 1
2, accepted, 2018-10-16 11:16:32

Answer 2
2, 2018-10-16 12:10:39

Answer 3
1, 2018-10-16 11:28:23
