
How do I define a function to extract values from a nested dictionary for each row in Python?

I have a column named 'urls' in a dataframe 'df', where each row holds a nested dictionary that maps a URL to whether it is malicious or not. I'd like to extract only the innermost value of the nested dictionary for each row.

0    {'url example 1': {'malicious': False}}
1    {'url example 2': {'malicious': False}}  

I'd like to define a function and use 'apply' to get the result for each row.

Here's the sample function that I have defined.

def urlconcern(url):
    try:
        r = s.lookup_urls([url]) 
        return r.values()
    except:
        pass

Running this with 'apply':

df['urls'].apply(urlconcern)

This only gives the result below, strangely wrapped in round brackets:

0    ({'malicious': False})
1    ({'malicious': False})

The desired output would be:

False
False

Is there any way to do this?

Given a pandas Series s (I'm assuming that's what it is):

import pandas as pd

s = pd.Series([{'url example 1': {'malicious': False}},
               {'url example 2': {'malicious': False}}])

you can use a generator expression inside next() to pull the first value out of the nested dicts, falling back to None if there is none.

out = s.apply(lambda url: next((v for d in url.values() for v in d.values()), None))

Output:

0    False
1    False
dtype: bool
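Since the question asks for a defined function rather than a lambda, the same lookup could be sketched as a named function. The name first_inner_value and its default parameter are hypothetical, and the sample data is made up for illustration. Note that dict.values() returns a view object rather than a single value, which is likely why the original attempt displayed a wrapped dict instead of a plain boolean.

```python
import pandas as pd


def first_inner_value(record, default=None):
    """Return the first value found in the inner dicts of `record`.

    Hypothetical helper: falls back to `default` when `record` is not
    a dict or contains no non-empty inner dict.
    """
    if not isinstance(record, dict):
        return default
    for inner in record.values():
        if isinstance(inner, dict):
            for value in inner.values():
                return value  # first inner value wins
    return default


# Made-up sample data in the shape shown in the question
s = pd.Series([{'url example 1': {'malicious': False}},
               {'url example 2': {'malicious': True}}])

flags = s.apply(first_inner_value)
```

A named function like this is easier to extend than the one-liner, for example to handle missing or malformed rows explicitly.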

However, I'm not convinced this is what you're looking for, since you lose the URL information this way.

Is this a pandas DataFrame? Did you instantiate it yourself? You may want to look at how this dictionary is constructed, because the data should look more like this:

>>> df = {'url':['url example 1', 'url example 2', 'url example 3'], 'malicious': [False, False, True]}
>>> df = pd.DataFrame(df)
>>> df
             url  malicious
0  url example 1      False
1  url example 2      False
2  url example 3       True

Then filter on the 'malicious' column:

>>> df[~df['malicious']]
             url  malicious
0  url example 1      False
1  url example 2      False

I know this doesn't answer your question exactly, but it's a standard way of working with DataFrames and should help your workflow later down the line.
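If the nested dictionaries already exist in a 'urls' column, one way to get to that flat layout might look like the sketch below. It assumes every row is a dict of the form {url: {'malicious': bool}}; the variable names rows, flat and safe are just illustrative.

```python
import pandas as pd

# Made-up data in the nested shape from the question
df = pd.DataFrame({'urls': [{'url example 1': {'malicious': False}},
                            {'url example 2': {'malicious': False}},
                            {'url example 3': {'malicious': True}}]})

# One flat record per (url, info) pair; the inner dict's keys become columns
rows = [
    {'url': url, **info}
    for record in df['urls']
    for url, info in record.items()
]
flat = pd.DataFrame(rows)

# Keep only the rows that are not flagged as malicious
safe = flat[~flat['malicious']]
```

This keeps the URL next to its flag, so filtering and later joins work on ordinary columns instead of nested objects.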
