
string function in Python to extract between two characters

I have the string below and I want to extract everything from <img... up to the closing " after .jpg.

I tried the code below, but instead of stopping at the first " after the URL, it runs all the way to the last " in the string.

Can anyone help?

start = 'img src="'
end = '"'
print(string[string.find(start)+len(start):string.rfind(end)])

STRING:

 <p><a href="https://news.yahoo.com/us-ambassador-takes-post-united-nations-141833297.html"><img src="http://l1.yimg.com/uu/api/res/1.2/1f8jyGM.NfkxLb_.OgMaIQ--/YXBwaWQ9eXRhY2h5b247aD04Njt3PTEzMDs-/http://media.zenfs.com/en_us/News/afp.com/f5bbc19135065fcfff40e6ece9650f4ab225fa97.jpg" width="130" height="86" alt="New US ambassador takes up post at United Nations" align="left" title="New US ambassador takes up post at United Nations" border="0" ></a>US Ambassador Kelly Craft took up her post at the United Nations on Thursday, vowing to defend America's values and interests nine months after the departure of her high-profile predecessor Nikki Haley. Craft, 57, served previously as US ambassador to Canada where she was involved in negotiations on a new US Mexico Canada free trade agreement.<p><br clear="all">

If the markup always has this shape, you can use a regex like this:

<img.*?\.jpg"

The non-greedy .*? stops at the first .jpg" that follows <img, so the match does not run on to a later quote. You can try it out on Regex101 and tweak it to suit your requirements. A regex is a better fit here than combining string find and len calls.
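As a rough illustration of how that pattern might be applied from Python, here is a minimal sketch using the standard re module. The html value is a shortened, hypothetical stand-in for the string in the question, and the second pattern (with a capture group for the URL alone) is an added assumption, not part of the original answer.

```python
import re

# Shortened, hypothetical stand-in for the feed snippet in the question.
html = (
    '<p><a href="https://example.com/story.html">'
    '<img src="http://example.com/photo.jpg" width="130" height="86" '
    'alt="caption" ></a>Some text.<p><br clear="all">'
)

# Non-greedy match from "<img" up to the first '.jpg"' that follows it.
tag_match = re.search(r'<img.*?\.jpg"', html)
img_prefix = tag_match.group(0) if tag_match else None

# A capture group returns only the URL inside src="...".
src_match = re.search(r'<img\s+src="([^"]*)"', html)
img_url = src_match.group(1) if src_match else None

print(img_prefix)
print(img_url)
```

Checking the match object before calling group() avoids an AttributeError when the input contains no image tag.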

If you don't want to use a regex, you could just use the .split() method.

# Named html rather than str so the built-in str type is not shadowed.
html = """<p><a href="https://news.yahoo.com/us-ambassador-takes-post-united-nations-141833297.html"><img src="http://l1.yimg.com/uu/api/res/1.2/1f8jyGM.NfkxLb_.OgMaIQ--/YXBwaWQ9eXRhY2h5b247aD04Njt3PTEzMDs-/http://media.zenfs.com/en_us/News/afp.com/f5bbc19135065fcfff40e6ece9650f4ab225fa97.jpg" width="130" height="86" alt="New US ambassador takes up post at United Nations" align="left" title="New US ambassador takes up post at United Nations" border="0" ></a>US Ambassador Kelly Craft took up her post at the United Nations on Thursday, vowing to defend America's values and interests nine months after the departure of her high-profile predecessor Nikki Haley. Craft, 57, served previously as US ambassador to Canada where she was involved in negotiations on a new US Mexico Canada free trade agreement.<p><br clear="all">"""


# Take the text after 'img src="', then cut it at '" width=' so only the URL remains.
final = html.split("img src=\"")[1].split("\" width=")[0]

print(final)

Output:

http://l1.yimg.com/uu/api/res/1.2/1f8jyGM.NfkxLb_.OgMaIQ--/YXBwaWQ9eXRhY2h5b247aD04Njt3PTEzMDs-/http://media.zenfs.com/en_us/News/afp.com/f5bbc19135065fcfff40e6ece9650f4ab225fa97.jpg
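The find-based attempt in the question can also be made to work. rfind returns the position of the last " in the whole string, whereas find with a start offset returns the first " after the marker. The sketch below assumes a hypothetical helper name, extract_between, and a shortened sample string rather than the full feed snippet.

```python
def extract_between(text, start, end):
    """Return the text between the first `start` and the next `end` after it."""
    begin = text.find(start)
    if begin == -1:
        return None  # start marker not present
    begin += len(start)
    # Search forward from the end of the start marker, not from the end of the string.
    stop = text.find(end, begin)
    if stop == -1:
        return None  # no closing delimiter after the marker
    return text[begin:stop]


# Shortened, hypothetical sample in the same shape as the string in the question.
sample = (
    '<a href="https://example.com/story.html">'
    '<img src="http://example.com/photo.jpg" width="130" alt="caption"></a>'
)
url = extract_between(sample, 'img src="', '"')
print(url)
```

Unlike the split version, this does not depend on width= being the attribute that follows src.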


 