Remove unwanted info from a series in pandas dataframe

Question

How do I extract the text in my image column between /images/ and .png?

I have a pandas dataframe containing the following information:

>>> animals

The column I want to manipulate is the image column:

0     {'url': '/images/bengal-tiger_image.png', 'lic...
1     {'url': '/images/giant-panda_image.png', 'lice...
2     {'url': '/images/blue-whale_image.png', 'licen...
3     {'url': '/images/asian-elephant_image.png', 'l...
4     {'url': '/images/gorilla_image.png', 'licence'...
5     {'url': '/images/snow-leopard_image.png', 'lic...
6     {'url': '/images/orangutan_image.png', 'licenc...
7     {'url': '/images/sea-turtle_image.png', 'licen...
8     {'url': '/images/black-rhino_image.png', 'lice...
9     {'url': '/images/african-penguin_image.png', '...
10    {'url': '/images/red-panda_image.png', 'licenc...
11    {'url': '/images/polar-bear_image.png', 'licen...
Name: image, dtype: object

My current attempt is the following:

animals['image'] = animals.apply(lambda x: x['image'](len["/images/":]))

But this produces the following error:

KeyError: 'image'

Any suggestions are welcome, thanks.

Answer 1

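# Fixed text surrounding the part of the URL we want to keep
left_string = '/images/'
right_string = '.png'

# Each element x is a dict: take x['url'], then slice off the prefix and the suffix
animals['image_text'] = animals['image'].apply(lambda x: x['url'][len(left_string):len(x['url'])-len(right_string)])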

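Remember that each element of the image column is a dictionary, so you need to index it with the 'url' key to get the string before slicing it. Your attempt raised KeyError: 'image' because calling apply on the whole dataframe passes each column (not each row) to the lambda by default, so x['image'] looks for an index label 'image' that does not exist. Calling apply on animals['image'] instead passes one dictionary at a time.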
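If you would rather not count characters by hand, a regular expression is another option. The following is only a sketch: the two-row dataframe and the 'licence' values are made up for illustration (the real values are truncated in the question), and it assumes every url starts with /images/ and ends with .png.

```python
import pandas as pd

# Hypothetical sample data mirroring the structure shown in the question;
# the 'licence' values are placeholders, not taken from the real data.
animals = pd.DataFrame({
    "image": [
        {"url": "/images/bengal-tiger_image.png", "licence": "placeholder"},
        {"url": "/images/giant-panda_image.png", "licence": "placeholder"},
    ]
})

# Step 1: pull the 'url' string out of each dictionary.
urls = animals["image"].map(lambda d: d["url"])

# Step 2: capture everything between the '/images/' prefix and the '.png' suffix.
# expand=False returns a Series because the pattern has a single capture group.
animals["image_text"] = urls.str.extract(r"^/images/(.+)\.png$", expand=False)

print(animals["image_text"].tolist())
# ['bengal-tiger_image', 'giant-panda_image']
```

Rows whose url does not match the pattern end up as NaN rather than raising an error, which can be handy for spotting unexpected values.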

1 answer

solution1
1 ACCEPTED 2020-10-21 19:00:20
