
Filling a dataframe from a dictionary's keys and values

I have the following dataframe as an example.

import pandas as pd

df_test = pd.DataFrame(data=None, index=["green","yellow","red","pink"], columns=["bear","dog","cat"], dtype=None, copy=False)


I have the following dictionary, whose keys match index labels of my dataframe and whose values are lists of its column names.

d = {"green":["bear","dog"], "yellow":["bear"], "red":["bear"]}

I would like to fill my dataframe according to the keys and values in the dictionary, and if a key is not present I want to fill that row with Empty.

Desired output

[image: desired output]

I can only think of building lists and looping over them. Is there an easier way to implement this, or are there functions that can help?

Loop over the dictionary and set the matching cells to 'Yes', then use mask to replace rows that are entirely missing with 'Empty', and finally replace the remaining missing values with 'No' using fillna:

for k, v in d.items():
    for x in v:
        df_test.loc[k, x] = 'Yes'

df_test = df_test.mask(df_test.isnull().all(axis=1), 'Empty').fillna('No')
print(df_test)
         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
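If you would rather not mutate df_test cell by cell, one possible variant is to build the rows first and construct the frame once. This is only a sketch under my own assumptions: the helper name fill_from_dict and its parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd


def fill_from_dict(index, columns, mapping):
    """Hypothetical helper: one row per label in `index`.

    Labels found in `mapping` get 'Yes'/'No' per column,
    labels missing from `mapping` get 'Empty' everywhere.
    """
    rows = {}
    for label in index:
        if label in mapping:
            present = set(mapping[label])
            rows[label] = ["Yes" if col in present else "No" for col in columns]
        else:
            rows[label] = ["Empty"] * len(columns)
    # orient="index" turns each dict entry into a row.
    return pd.DataFrame.from_dict(rows, orient="index", columns=columns)


d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}
result = fill_from_dict(["green", "yellow", "red", "pink"], ["bear", "dog", "cat"], d)
print(result)
```

It still loops in Python, so it is not faster, but it avoids the mask/fillna step because every cell is decided when the row is built.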

You can achieve what you want by doing:

import numpy as np

# index_list may contain labels that are not in the original dataframe;
# those rows are filled with 'Empty'

index_list = ["green", "yellow", "red", "pink", "purple"]

replace_dict = {True: 'Yes', False: 'No', np.nan:'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty

Explanation

You can accomplish what you want by checking, for each row, whether each column name of the dataframe appears in the list stored under that row's key in the dict:

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
    index=x.index), axis=1)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False
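The same membership check can be seen in isolation without apply. This is a minimal sketch, assuming only the column labels and the dict from the question; the name membership is mine.

```python
import pandas as pd

columns = pd.Index(["bear", "dog", "cat"])
d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}

# Index.isin returns one boolean per column label for a given list.
# Building a column per key and transposing gives one row per key.
membership = pd.DataFrame(
    {key: columns.isin(values) for key, values in d.items()},
    index=columns,
).T

print(membership)
```

Each row of membership answers "is this column name in d[row]?", which is what the lambda above computes with x.index.isin(d[x.name]).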

Then reindex with the full list of colors (index_list) so that colors missing from the dict are added as rows of NaN:

index_list = ["green","yellow","red","pink", "purple"]

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
       index=x.index), axis=1).reindex(index_list)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False
pink     NaN    NaN    NaN
purple   NaN    NaN    NaN
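To see what reindex does on its own, here is a small sketch on a hand-written boolean frame (the names flags and expanded are just for illustration):

```python
import pandas as pd

flags = pd.DataFrame(
    {"bear": [True, True], "dog": [True, False]},
    index=["green", "yellow"],
)

# Labels already in the index keep their values; new labels get NaN rows.
expanded = flags.reindex(["green", "yellow", "pink"])

print(expanded)
```

The existing rows are untouched and "pink" appears as a row of NaN, which is what the later replacement step turns into 'Empty'.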

Then, if you want to change the values you can replace them by using a dictionary like this:

replace_dict = {True: 'Yes', False: 'No', np.nan:'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty
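If you prefer not to rely on a dict that mixes True, False and NaN as keys, the same labelling can be sketched with numpy.where. This is an alternative I am suggesting, not what the answer above uses, and the frame below is a hand-written stand-in for the reindexed result.

```python
import numpy as np
import pandas as pd

# Stand-in for the reindexed frame: True/False for known colors, NaN otherwise.
expanded = pd.DataFrame(
    {"bear": [True, True, np.nan], "dog": [True, False, np.nan]},
    index=["green", "yellow", "pink"],
)

# Missing cells become 'Empty'; the rest become 'Yes' or 'No'.
values = np.where(
    expanded.isna(),
    "Empty",
    np.where(expanded.eq(True), "Yes", "No"),
)
labelled = pd.DataFrame(values, index=expanded.index, columns=expanded.columns)

print(labelled)
```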

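Here's a largely vectorised solution via pd.get_dummies and pd.DataFrame.reindex:

# Assumes `import numpy as np` and `import pandas as pd`.
df = pd.DataFrame.from_dict(d, orient='index')

# Encode only the keys present in d, so colors missing from d stay absent
# until the final reindex and can then be filled with 'Empty'.
# Note: applymap is deprecated since pandas 2.1 in favor of DataFrame.map.
res = pd.get_dummies(df, prefix='', prefix_sep='')\
        .reindex(columns=df_test.columns)\
        .fillna(0).applymap({0: 'No', 1: 'Yes'}.get)\
        .reindex(index=np.hstack((df_test.index, df.index.difference(df_test.index))))\
        .fillna('Empty')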

print(res)

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
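One thing to watch with the approach above: from_dict(orient='index') spreads each list across positional columns, so an animal that appears in different positions for different colors would produce duplicate dummy columns. A possible variant that avoids this is Series.str.get_dummies. This is a sketch under the assumption that no animal name contains the "|" separator; the variable names are mine.

```python
import numpy as np
import pandas as pd

d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}
index = ["green", "yellow", "red", "pink"]
columns = ["bear", "dog", "cat"]

# Join each list into a string such as "bear|dog", then split it back
# into one 0/1 indicator column per distinct animal.
indicators = pd.Series(d).str.join("|").str.get_dummies(sep="|")

# Animals never mentioned in d (here "cat") get an all-zero column.
indicators = indicators.reindex(columns=columns, fill_value=0)

labels = pd.DataFrame(
    np.where(indicators.eq(1), "Yes", "No"),
    index=indicators.index,
    columns=indicators.columns,
)

# Colors missing from d become NaN rows, which are then filled with 'Empty'.
result = labels.reindex(index).fillna("Empty")

print(result)
```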


 