I have a column named 'urls' in a dataframe 'df' in which each row holds a nested dictionary mapping a URL to whether it is malicious or not. I'd like to extract only the innermost value of the nested dictionary for each row.
0 {'url example 1': {'malicious': False}}
1 {'url example 2': {'malicious': False}}
I'd like to define a function and use 'apply' to get the result for each row.
Here's the sample function that I have defined:
def urlconcern(url):
    try:
        r = s.lookup_urls([url])
        return r.values()
    except:
        pass
I then run it with 'apply':
df['urls'].apply(urlconcern)
This only gives the result below, which is (strangely) wrapped in round brackets:
0 ({'malicious': False})
1 ({'malicious': False})
The desired answer would be
False
False
Is there any way to do this?
Given a pandas Series s (I'm assuming it's a pandas Series):
s = pd.Series([{'url example 1': {'malicious': False}},
               {'url example 2': {'malicious': False}}])
you can use a generator expression inside next to look up the values of the nested dicts; next returns the first value it finds, or None if there is none.
out = s.apply(lambda url: next((v for d in url.values() for v in d.values()), None))
Output:
0    False
1    False
dtype: bool
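If the one-liner is hard to read, or some rows may not contain a dict at all (for example, the question's function returns None when the lookup raises), a named helper may be easier to work with. This is only a sketch under assumptions: the name first_malicious_flag, the Series name urls, and the None row are hypothetical and not part of the original data.

```python
import pandas as pd

def first_malicious_flag(entry):
    """Return the 'malicious' flag of the first nested dict, or None."""
    if not isinstance(entry, dict):
        return None  # e.g. a row where the lookup failed and produced None
    for inner in entry.values():
        if isinstance(inner, dict) and "malicious" in inner:
            return inner["malicious"]
    return None

# Sample data in the same shape as the question, plus one hypothetical bad row.
urls = pd.Series([
    {"url example 1": {"malicious": False}},
    {"url example 2": {"malicious": True}},
    None,
])

out = urls.apply(first_malicious_flag)
print(out)
```

Because of the None row, the result here has dtype object rather than bool.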
However, I'm not convinced this is what you're looking for, since you lose the URL information here.
Is this a pandas DataFrame? Did you instantiate it yourself? You may want to look at how this dictionary is constructed, because it should look more like this:
>>> df = {'url': ['url example 1', 'url example 2', 'url example 3'], 'malicious': [False, False, True]}
>>> df = pd.DataFrame(df)
>>> df
             url  malicious
0  url example 1      False
1  url example 2      False
2  url example 3       True
Then do:
>>> df[df['malicious'] == False]
             url  malicious
0  url example 1      False
1  url example 2      False
I know this doesn't answer your question exactly, but it's a standard way of working with DataFrames and should help your workflow later down the line.
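If the data already exists in the nested form from the question, one possible way to get to the flat layout above is to unpack each dictionary into a row. This is a minimal sketch, assuming every row is a dict of the form {url: {'malicious': bool}}; the names nested, rows, flat and safe, and the third sample row, are made up for illustration.

```python
import pandas as pd

# Nested data shaped like the 'urls' column in the question.
nested = pd.Series([
    {"url example 1": {"malicious": False}},
    {"url example 2": {"malicious": False}},
    {"url example 3": {"malicious": True}},  # hypothetical extra row
])

# One output row per (url, info) pair; the inner dict's keys become columns.
rows = [
    {"url": url, **info}
    for entry in nested
    for url, info in entry.items()
]
flat = pd.DataFrame(rows)

# Keep only the rows that are not flagged as malicious.
safe = flat[~flat["malicious"]]
print(flat)
print(safe)
```

This keeps the URL next to its flag, so filtering no longer discards which URL each result belongs to.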