Scraping clean scientific names without parentheses using regex

Question

I'm scraping scientific names from a website using regex, and I can't figure out how to exclude the parentheses that surround each scientific name.

The HTML is written like this:

<span class="SciName">(Acanthastrea bowerbanki)</span>

My regex is written like this:

regex = '<span class="SciName">(.+?)</span>'

My results look like this:

(Acanthastrea bowerbanki)

But I need them to look like this:

Acanthastrea bowerbanki

Answer 1

You need an extra pair of parentheses, which you must escape with backslashes so that they match literal characters; the unescaped inner pair remains the capture group:

regex = r'<span class="SciName">\((.+?)\)</span>'

You can use it like this:

import re

text = '<span class="SciName">(Acanthastrea bowerbanki)</span>'
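# \( and \) match the literal parentheses; the inner (.+?) captures the name
regex = r'<span class="SciName">\((.+?)\)</span>'
m = re.match(regex, text)
print(m.group(1))  # Acanthastrea bowerbanki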
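
Note that `re.match` only tests the start of the string, so on a full page something like `re.findall` is probably closer to what a scraper needs. A minimal sketch, assuming a page with several spans: the `html` fragment and its second species entry are made up for illustration, and making the parentheses optional with `\(?` and `\)?` is an assumption for pages where some names lack them.

```python
import re

# Hypothetical page fragment; only the first span comes from the question.
html = (
    '<li><span class="SciName">(Acanthastrea bowerbanki)</span></li>'
    '<li><span class="SciName">Acropora palmata</span></li>'
)

# \(? and \)? match an optional literal parenthesis outside the capture group,
# so the captured name never includes the parentheses.
pattern = re.compile(r'<span class="SciName">\(?(.+?)\)?</span>')

# findall returns the contents of the single capture group for every match.
names = pattern.findall(html)
print(names)
```

If every name on the page is known to be parenthesized, the stricter pattern from the answer is the safer choice, since it will not match unexpected markup.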

Answer 2

You don't need to use regex for this.

s = 'blah blah blah (Acanthastrea bowerbanki) blah blah blah'

scientistName = s[s.find("(")+1:s.find(")")]
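
One caveat with the slice above: `str.find` returns -1 when the character is missing, so the slice quietly returns the wrong text instead of failing. A minimal sketch of a guarded version follows; `between_parens` is a hypothetical helper name, not part of the answer.

```python
def between_parens(s):
    """Return the text inside the first (...) pair, or None if there is none."""
    start = s.find("(")
    # Search for the closing parenthesis only after the opening one.
    end = s.find(")", start + 1)
    if start == -1 or end == -1:
        return None
    return s[start + 1:end]


s = 'blah blah blah (Acanthastrea bowerbanki) blah blah blah'
print(between_parens(s))
```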

Answer 1: score 3, accepted, posted 2013-10-31 21:22:53

Answer 2: score 0, posted 2013-10-31 21:25:05
