Python regular expression match fails

Question

This pattern passes on https://regex101.com/ without any problem. Am I missing something? The whole string is on a single line.

import re

def get_title_and_content(html):
  # The sample document is hard-coded here for testing; it overrides the argument.
  html = """<!DOCTYPE html>     <html>       <head>       <title>Change delivery date with Deliv</title>       </head>       <body>       <div class="gkms web">The delivery date can be changed up until the package is assigned to a driver.</div>       </body>     </html>  """
  title_pattern = re.compile(r'<title>(.*?)</title>(.*)')
  # This is the call in question: it returns None for the string above.
  match = title_pattern.match(html)
  if match:
    print('successfully extract title and answer')
    return match.groups()[0].strip(), match.groups()[1].strip()
  else:
    print('unable to extract title or answer')

Answer 1

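To summarize the comments:

title_pattern.search(html) should be used instead of title_pattern.match(html)

This is because search looks for the pattern anywhere in the given string, not only at the beginning. match = title_pattern.findall(html) can be used in a similar way, but it returns a list of results (here, one tuple of groups per match) rather than a single match object.

As mentioned earlier, using BeautifulSoup pays off more in the long run, because regular expressions are not well suited to searching HTML.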
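The difference between the three calls can be checked with a short sketch. The HTML below is a shortened stand-in for the string in the question (an assumption made for brevity); the pattern is the one from the question.

```python
import re

# Shortened stand-in for the HTML in the question (assumption, for brevity).
html = ('<!DOCTYPE html> <html> <head> <title>Change delivery date with Deliv</title> '
        '</head> <body> <div class="gkms web">The delivery date can be changed.</div> '
        '</body> </html>')

title_pattern = re.compile(r'<title>(.*?)</title>(.*)')

# match() only tries position 0, where the string starts with "<!DOCTYPE",
# so the pattern cannot match there.
anchored = title_pattern.match(html)

# search() scans forward until the pattern fits somewhere in the string.
found = title_pattern.search(html)
title = found.group(1).strip()
rest = found.group(2).strip()

# findall() returns a list; with two groups, each entry is a (title, rest) tuple.
all_found = title_pattern.findall(html)

print(anchored)
print(title)
print(len(all_found))
```

Because the trailing `(.*)` consumes the rest of the line, findall yields a single tuple here, so search is the more direct choice when only one title is expected.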
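As a rough illustration of the parser-based approach, here is a minimal sketch. It does not use BeautifulSoup itself: to stay dependency-free it swaps in the standard library's html.parser module. The class name TitleAndTextParser and its attributes are hypothetical, chosen for this example only, and the HTML is a one-line copy of the document from the question.

```python
from html.parser import HTMLParser


class TitleAndTextParser(HTMLParser):
    """Hypothetical helper: collects the <title> text and the text inside <body>."""

    def __init__(self):
        super().__init__()
        self._open = set()     # which of 'title' / 'body' are currently open
        self.title = ''
        self.body_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ('title', 'body'):
            self._open.add(tag)

    def handle_endtag(self, tag):
        self._open.discard(tag)

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return             # skip whitespace between tags
        if 'title' in self._open:
            self.title += text
        elif 'body' in self._open:
            self.body_parts.append(text)


html = ('<!DOCTYPE html> <html> <head> <title>Change delivery date with Deliv</title> '
        '</head> <body> <div class="gkms web">The delivery date can be changed up until '
        'the package is assigned to a driver.</div> </body> </html>')

parser = TitleAndTextParser()
parser.feed(html)
parser.close()

print(parser.title)
print(parser.body_parts)
```

Unlike the regular expression, this does not depend on where the title sits in the string or on whether the document is on one line.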

Answer 2

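The comments are correct: re.match() only matches at the beginning of the string. That said, you can prepend .* to your regular expression so that it still matches from the start: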

title_pattern = re.compile(r'.*<title>(.*?)</title>(.*)')
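A quick check of the prefixed pattern, again on a shortened stand-in for the question's HTML (an assumption). The second half is an aside beyond what the answer states: "." does not match a newline by default, so if the HTML were spread over several lines, the leading .* would need the re.DOTALL flag.

```python
import re

# Shortened one-line stand-in for the HTML in the question (assumption).
html = ('<!DOCTYPE html> <html> <head> <title>Change delivery date with Deliv</title> '
        '</head> <body> <div>text</div> </body> </html>')

prefixed = re.compile(r'.*<title>(.*?)</title>(.*)')

# The leading ".*" absorbs everything before <title>, so match() succeeds at position 0.
m = prefixed.match(html)
title = m.group(1).strip()

# Same document, but with one tag per line.
multiline_html = html.replace('> <', '>\n<')

# Without DOTALL, ".*" cannot cross the first newline, so the match fails.
plain = prefixed.match(multiline_html)

# With DOTALL, "." also matches newlines and the title is found again.
dotall = re.compile(r'.*<title>(.*?)</title>(.*)', re.DOTALL).match(multiline_html)

print(title)
print(plain)
print(dotall.group(1))
```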
