BeautifulSoup meta content tag

Question

<meta itemprop="streetAddress" content="4103 Beach Bluff Rd">

I have to get the content '4103 Beach Bluff Rd'. I'm trying to do this with BeautifulSoup, so I'm trying this:

soup = BeautifulSoup('<meta itemprop="streetAddress" content="4103 Beach Bluff Rd"> ')

soup.find(itemprop="streetAddress").get_text()

but I'm getting an empty string as the result, which may make sense given that when I print the soup object

print soup

I get this:

<html><head><meta content="4103 Beach Bluff Rd" itemprop="streetAddress"/> </head></html>

Apparently the data I want is in the 'content' of the meta tag. How can I get this data?

Answer 1

soup.find(itemprop="streetAddress").get_text()

You are getting the text of the matched element, and a meta tag has no text inside it, so the result is an empty string. Instead, get the "content" attribute value:

soup.find(itemprop="streetAddress").get("content")

This is possible since BeautifulSoup provides a dictionary-like interface to tag attributes:

You can access a tag's attributes by treating the tag like a dictionary.
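As a minimal sketch of that dictionary-like interface, the snippet below reads the same attribute three ways. It assumes Python 3 with bs4 installed and passes the built-in "html.parser" explicitly; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<meta itemprop="streetAddress" content="4103 Beach Bluff Rd">'
# Naming the parser explicitly is an assumption here; the original omits it.
soup = BeautifulSoup(html, "html.parser")
tag = soup.find(itemprop="streetAddress")

# Square-bracket access works like a dict lookup and raises KeyError
# if the attribute does not exist.
street = tag["content"]

# .get() returns None (or a supplied default) for a missing attribute.
street_via_get = tag.get("content")
missing = tag.get("no-such-attribute")

# .attrs exposes all attributes of the tag as a dictionary.
all_attrs = dict(tag.attrs)

print(street)
print(all_attrs)
```

Square brackets suit cases where the attribute must be present, while .get() is the safer choice when it may be absent.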

Demo:

>>> from bs4 import BeautifulSoup
>>>
>>> soup = BeautifulSoup('<meta itemprop="streetAddress" content="4103 Beach Bluff Rd"> ', 'html.parser')
>>> soup.find(itemprop="streetAddress").get_text()
''
>>> soup.find(itemprop="streetAddress").get("content")
'4103 Beach Bluff Rd'
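On a real page the meta tag may be missing, in which case find() returns None and the chained .get("content") call fails with AttributeError. The following is a hedged sketch of a small guard around the same lookup. The helper name meta_content and the "addressLocality" lookup are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup


def meta_content(soup, itemprop):
    """Return the content attribute of the first tag with this itemprop, or None.

    meta_content is a hypothetical helper, not part of BeautifulSoup.
    """
    tag = soup.find(attrs={"itemprop": itemprop})
    if tag is None:
        # No matching tag in the document.
        return None
    # .get() also returns None if the tag has no content attribute.
    return tag.get("content")


soup = BeautifulSoup(
    '<meta itemprop="streetAddress" content="4103 Beach Bluff Rd">',
    "html.parser",
)

street = meta_content(soup, "streetAddress")
locality = meta_content(soup, "addressLocality")  # not present in this markup

print(street)
print(locality)
```

Returning None for both "tag missing" and "attribute missing" keeps the caller's check simple; a stricter version could raise instead.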

Answer 1: 11, accepted, 2015-12-16 02:03:13
