How do I extract a string between two other strings in Python?

Question

Say I have a string like str1 = "IWantToMasterPython".

If I want to extract "Py" from the string above, I would write:

extractedString = foo("Master","thon")

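I want to do all this because I am trying to extract lyrics from an HTML page. The lyrics are written like <div class = "lyricbox"> ....lyrics goes here....</div>.

Any suggestions on how to implement this?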

Answer 1

One solution is to use a regular expression:

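import re

str1 = "IWantToMasterPython"
# The non-greedy group captures the text between "Master" and "thon".
r = re.compile('Master(.*?)thon')
m = r.search(str1)
if m:
    lyrics = m.group(1)  # "Py"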
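
The pattern above hard-codes the two delimiters. As a minimal sketch of how it could be generalized (the helper name `between` is hypothetical and not part of the original answer), the pattern can be built at run time. This assumes the delimiters should be matched literally, so they are passed through `re.escape`:

```python
import re

def between(text, leader, trailer):
    """Return the shortest substring between leader and trailer, or None."""
    # re.escape makes characters such as '.' or '(' in the delimiters literal.
    pattern = re.escape(leader) + r"(.*?)" + re.escape(trailer)
    # re.DOTALL lets '.' match newlines, so the result may span several lines.
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None

print(between("IWantToMasterPython", "Master", "thon"))  # Py
print(between("IWantToMasterPython", "Expert", "thon"))  # None
```

Returning None when nothing matches mirrors the `if m:` check in the answer; raising an exception instead would be an equally reasonable choice.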

Answer 2

BeautifulSoup is the easiest way to do what you want. It can be installed as follows:

pip install beautifulsoup4

Sample code that does what you want:

from bs4 import BeautifulSoup

doc = ['<div class="lyricbox">Hey You</div>']
soup = BeautifulSoup(''.join(doc), 'html.parser')
print(soup.find('div', {'class': 'lyricbox'}).string)

You can fetch the content directly from a URL with Python's urllib. The Beautiful Soup documentation is also helpful if you want to do more parsing.
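
If installing a third-party package is not an option, roughly the same extraction can be sketched with the standard library's `html.parser` module. This is a swapped-in technique, not BeautifulSoup and not part of the original answer. The names `LyricBoxParser` and `extract_lyrics` are hypothetical, and the sketch assumes the lyrics sit inside a `div` whose class list contains `lyricbox`:

```python
from html.parser import HTMLParser

class LyricBoxParser(HTMLParser):
    """Collect the text inside <div class="lyricbox"> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside a lyricbox div
        self.chunks = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # a nested div inside the lyric box
        elif "lyricbox" in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering the lyric box itself

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

def extract_lyrics(html):
    """Return the text found inside lyricbox divs, or '' if there is none."""
    parser = LyricBoxParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()

print(extract_lyrics('<p>intro</p><div class="lyricbox">Hey You</div>'))  # Hey You
```

The depth counter is there so that a `div` nested inside the lyric box does not end the capture early. For real pages, which are often malformed, BeautifulSoup is likely to be the more forgiving choice.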

Answer 3

def foo(s, leader, trailer):
    """Return the part of s between leader and the next trailer after it."""
    end_of_leader = s.index(leader) + len(leader)
    start_of_trailer = s.index(trailer, end_of_leader)
    return s[end_of_leader:start_of_trailer]

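This raises ValueError if the leader is not present in the string s, or if the trailer is not present after it. You did not specify what behavior you want in such anomalous conditions; raising an exception is a natural and Pythonic thing to do, letting the caller handle it with try/except if it knows what to do in such cases.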

An RE-based approach is also possible, but I think this pure-string approach is simpler.
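
The try/except handling mentioned above might look like the following sketch. The wrapper name `foo_or_default` and its `default` parameter are hypothetical additions for illustration; `foo` is repeated so the example runs on its own:

```python
def foo(s, leader, trailer):
    end_of_leader = s.index(leader) + len(leader)
    start_of_trailer = s.index(trailer, end_of_leader)
    return s[end_of_leader:start_of_trailer]

def foo_or_default(s, leader, trailer, default=None):
    """Like foo, but return default when either delimiter is missing."""
    try:
        return foo(s, leader, trailer)
    except ValueError:
        # str.index raises ValueError when the substring is not found.
        return default

print(foo_or_default("IWantToMasterPython", "Master", "thon"))  # Py
print(foo_or_default("IWantToMasterPython", "Master", "xyz"))   # None
```

Keeping the exception in `foo` and catching it only in a wrapper leaves the decision about missing delimiters with the caller, as the answer suggests.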

Answer 4

If you are extracting any data from HTML pages, I strongly recommend the BeautifulSoup library. I also use it to extract data from HTML, and it works well.

Answer 5

If you want all matches returned in a list, you can also try this:

import re
str1 = "IWantToMasterPython"

out = re.compile('Master(.*?)thon', re.DOTALL | re.IGNORECASE).findall(str1)
if out:
    print(out)
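
With the single-match input above, the effect of `findall` and of the two flags is hard to see. The following sketch uses a made-up input string (an assumption for illustration only) that contains several matches, one in a different case and one spanning a line break:

```python
import re

# Hypothetical input with three delimiter pairs.
text = "MasterPython, masterMarathon and\nMASTER\nbiathon"

# re.IGNORECASE matches "master" and "MASTER" as well as "Master";
# re.DOTALL lets '.' cross the newline in the last pair.
pattern = re.compile(r"Master(.*?)thon", re.DOTALL | re.IGNORECASE)
matches = pattern.findall(text)

print(matches)  # ['Py', 'Mara', '\nbia']
```

Without `re.DOTALL` the last pair would not match, because `.` stops at a newline by default.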

Answer 1: score 31, accepted, 2009-09-04 00:23:53
Answer 2: score 10, 2009-09-04 16:09:02
Answer 3: score 8, 2009-09-04 00:24:59
Answer 4: score 2, 2009-09-04 10:51:25
Answer 5: score 2, 2013-02-06 11:43:55
