Beautiful Soup: get just the value inside a tag

Question

The following command:

volume = soup.findAll("span", {"id": "volume"})[0]

gives:

<span class="gr_text1" id="volume">16,103.3</span>

when I issue a print(volume).

How do I get just the number?

Answer 1

Extract the string from the element:

volume = soup.findAll("span", {"id": "volume"})[0].string
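
For anyone who wants to run this end to end, here is a minimal sketch using the markup from the question. It assumes the bs4 package is installed and uses the built-in html.parser; find() stands in for findAll(...)[0] because it returns the first match directly. The float conversion is an extra step that assumes the comma is a thousands separator.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag (or None), so no [0] is needed.
volume_tag = soup.find("span", id="volume")

# .string is the tag's only child string.
volume_text = volume_tag.string

# Assumption: the comma is a thousands separator, so drop it before parsing.
volume_number = float(volume_text.replace(",", ""))
print(volume_text, volume_number)
```

If the span might be missing from the page, check volume_tag for None before touching .string.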

Answer 2

Using a CSS selector:

>>> soup.select('span#volume')[0].text
u'16,103.3'
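
A variation on the same idea, as a sketch: select_one() returns the first match or None, which avoids an IndexError from [0] when nothing matches. The html string is just the markup from the question, and the None fallback is an illustrative choice, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
html = '<span class="gr_text1" id="volume">16,103.3</span>'
soup = BeautifulSoup(html, "html.parser")

# select_one() gives the first element matching the CSS selector, or None.
match = soup.select_one("span#volume")

# get_text(strip=True) trims surrounding whitespace from the tag's text.
text = match.get_text(strip=True) if match is not None else None
print(text)
```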

Answer 3

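Try this:

# loop over the list returned by findAll and print each tag's text
for a in soup.findAll("span", {"id": "volume"}):
    print(a.get_text())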

Answer 4

Just to add: I also found that .string doesn't work well when there is a <br> in the text.

For example:

 <div class = "Lines">
    <span> First Line <br> Second Line <br> Third Line </span>
  </div>

If we do soup.find("div", attrs={"class": "Lines"}).span.string we get None.

But with soup.find("div", attrs={"class": "Lines"}).span.text we get:

 First Line Second Line Third Line

I think .string gives a NavigableString object and .text gives a unicode object. .string is None here because the span has more than one child (the text pieces plus the <br> tags), so there is no single string to return.
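
The difference can be reproduced with a short sketch. It assumes the bs4 package with the built-in html.parser; the variable names are only illustrative. stripped_strings and get_text() with a separator are two ways to collect the pieces when .string gives None.

```python
from bs4 import BeautifulSoup

# Markup from the example above.
html = """
<div class="Lines">
  <span> First Line <br> Second Line <br> Third Line </span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
span = soup.find("div", attrs={"class": "Lines"}).span

# The span has several children, so .string has no single value to return.
print(span.string)

# stripped_strings yields each text piece with whitespace trimmed.
lines = list(span.stripped_strings)

# get_text() joins all text pieces; here with a single space between them.
joined = span.get_text(" ", strip=True)
print(lines)
print(joined)
```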

Answer 5

You can get the value of the tag from its first child: tag.contents[0]

Try this:

volumes = soup('span')
for volume in volumes:
    print(volume.contents[0])
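
One caveat, shown as a sketch: contents is a plain list of children, so contents[0] raises IndexError on an empty tag. The second span below is a made-up empty element added only to show the guard.

```python
from bs4 import BeautifulSoup

# First span is from the question; the empty one is hypothetical.
html = '<span id="volume">16,103.3</span><span id="empty"></span>'
soup = BeautifulSoup(html, "html.parser")

values = []
for span in soup.find_all("span"):
    # Guard against empty tags, which have no first child.
    first = span.contents[0] if span.contents else None
    values.append(first)
print(values)
```

Note that contents[0] may also be a nested tag rather than a string if the span starts with markup.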

5 answers

Answer 1
34 accepted 2014-02-25 02:22:47

Answer 2
9 2014-02-25 02:23:19

Answer 3
2 2018-03-28 10:19:39

Answer 4
0 2020-03-31 07:41:47

Answer 5
0 2020-09-07 17:19:41
