Remove unwanted info from a series in pandas dataframe

How do I extract the text in my image column between /images/ and .png?

I have a pandas dataframe containing the following information:

>>> animals

[image: enter image description here]

The column I want to manipulate is the image column:

0     {'url': '/images/bengal-tiger_image.png', 'lic...
1     {'url': '/images/giant-panda_image.png', 'lice...
2     {'url': '/images/blue-whale_image.png', 'licen...
3     {'url': '/images/asian-elephant_image.png', 'l...
4     {'url': '/images/gorilla_image.png', 'licence'...
5     {'url': '/images/snow-leopard_image.png', 'lic...
6     {'url': '/images/orangutan_image.png', 'licenc...
7     {'url': '/images/sea-turtle_image.png', 'licen...
8     {'url': '/images/black-rhino_image.png', 'lice...
9     {'url': '/images/african-penguin_image.png', '...
10    {'url': '/images/red-panda_image.png', 'licenc...
11    {'url': '/images/polar-bear_image.png', 'licen...
Name: image, dtype: object

My current attempt is the following:

animals['image'] = animals.apply(lambda x: x['image'](len["/images/":]))

But this produces the following error:

KeyError: 'image'

Any suggestions are welcome, thanks.

left_string = '/images/'
right_string = '.png'

animals['image_text'] = animals['image'].apply(lambda x: x['url'][len(left_string):len(x['url'])-len(right_string)])

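Remember that x is a dictionary, so you need to use 'url' as the key to get the string before slicing it. Also note that apply is called on animals['image'] rather than on the whole dataframe: animals.apply(...) without axis=1 passes each column to the lambda, which is why x['image'] raised the KeyError.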
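For anyone who wants to try this out, here is a minimal self-contained sketch of the same idea. The sample dataframe is an assumption: it copies only the first three URLs from the question, and the 'licence' values are placeholders because the original output is truncated. The helper name strip_url and the extra column image_text_regex are hypothetical, added only for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the question's 'image' column.
# The 'licence' values are placeholders; the real ones are truncated in the post.
animals = pd.DataFrame({
    "image": [
        {"url": "/images/bengal-tiger_image.png", "licence": "placeholder"},
        {"url": "/images/giant-panda_image.png", "licence": "placeholder"},
        {"url": "/images/blue-whale_image.png", "licence": "placeholder"},
    ]
})

left_string = "/images/"
right_string = ".png"


def strip_url(record):
    """Return the part of record['url'] between left_string and right_string."""
    url = record["url"]
    # A negative stop index drops the trailing '.png'.
    return url[len(left_string):-len(right_string)]


# Apply to the column (a Series), so each call receives one dictionary.
animals["image_text"] = animals["image"].apply(strip_url)

# Alternative: pull out 'url' first, then capture the middle part with a regex.
urls = animals["image"].apply(lambda record: record["url"])
animals["image_text_regex"] = urls.str.extract(r"^/images/(.*)\.png$", expand=False)

print(animals["image_text"].tolist())
# ['bengal-tiger_image', 'giant-panda_image', 'blue-whale_image']
```

The slice version assumes every URL starts with /images/ and ends with .png. The regex version gives NaN for rows that do not match, which may be easier to spot if the data is less regular.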
