Scraping values from a dropdown menu

Question

I am trying to scrape the value and text from a dropdown element on a webpage using Python with Selenium and Beautiful Soup.

I am able to get the text, but I am not able to get the value through the get_attribute command.

When I print the element that I located on the webpage, it prints the <select> element shown further below:

print(price)

The last print statement, the one that calls get_attribute, fails with this error:

TypeError: 'NoneType' object is not callable

price=soup.find("select",{"id":"space-prices"})
print(price)
print(price.text)
print(price.get_attribute('value'))

The output of print(price) is:

<select class="pricing-bar-select" id="space-prices" name="space-prices"><option selected="selected" value="£360">Per Day</option>
<option value="£1,260">Per Week</option>
<option value="£5,460">Per Month</option>
<option value="£16,380">Per Quarter</option>
<option value="£65,520">Per Year</option></select>

The URL of the webpage is:

https://www.appearhere.co.uk/spaces/north-kensington-upcycling-store-and-cafe

Answer 1

Try this:

from selenium import webdriver
from bs4 import BeautifulSoup

url = "https://www.appearhere.co.uk/spaces/north-kensington-upcycling-store-and-cafe"

# Load the page in Chrome
driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)

# Hand the page source to Beautiful Soup and locate the dropdown
soup = BeautifulSoup(driver.page_source, "html.parser")
price = soup.find("select", {"id": "space-prices"})
options = price.find_all("option")

# Print each option's visible text next to its "value" attribute (Python 3)
for option in options:
    print(option.text, option.get("value"))
driver.quit()

It will print:

Per Day £360
Per Week £1,260
Per Month £5,460
Per Quarter £16,380
Per Year £65,520

Hope this is what you want.

Answer 2

It's because price.get_attribute is None. Beautiful Soup tags have no get_attribute method (that name belongs to Selenium's WebElement). Beautiful Soup treats an unknown attribute name as a search for a child tag with that name, and that search returns None when nothing matches. None is not a function that you can call, hence the error. If you took away the parentheses and just printed price.get_attribute, it would print None.
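The behaviour described above can be reproduced without a browser. This is a minimal sketch that assumes a cut-down copy of the dropdown HTML from the question; the variable names (missing, error_message, select_value, option_value) are only illustrative.

```python
from bs4 import BeautifulSoup

# Assumption: a shortened copy of the <select> element from the question
html = (
    '<select id="space-prices">'
    '<option selected="selected" value="£360">Per Day</option>'
    '</select>'
)
soup = BeautifulSoup(html, "html.parser")
price = soup.find("select", {"id": "space-prices"})

# An unknown attribute name is looked up as a child tag; none exists, so this is None
missing = price.get_attribute
print(missing)

# Calling None raises the TypeError from the question
try:
    price.get_attribute("value")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Tag.get() is the Beautiful Soup way to read an HTML attribute
select_value = price.get("value")         # <select> itself has no "value" attribute
option_value = price.option.get("value")  # the first <option> does
print(select_value, option_value)
```

Here price.option is shorthand for price.find("option"), which is the same child-tag lookup that made get_attribute come back as None.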

Also, the <select> tag doesn't have a "value" attribute in the first place. What you've done is grab the <select> tag and all of its children. Each child of the <select> tag (the <option> tags) has a "value" attribute. If you are trying to get the values of all of the <option> tags in that <select>, then you should do the following:

price=soup.find("select",{"id":"space-prices"})

# get all <option> tags in a list
options = price.find_all("option")

# for each element in that list, pull out the "value" attribute
values = [o.get("value") for o in options]
print(values)
# Python 3: ['£360', '£1,260', '£5,460', '£16,380', '£65,520']
# Python 2: [u'\xa3360', u'\xa31,260', u'\xa35,460', u'\xa316,380', u'\xa365,520']
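If the text and the value are needed together, the same approach extends naturally. The sketch below assumes the dropdown HTML quoted in the question and pairs each label with its value, then picks out the currently selected option. The to_pounds helper is hypothetical and only shows one way the price strings might be turned into integers.

```python
from bs4 import BeautifulSoup

# Assumption: the <select> element exactly as printed in the question
html = """<select class="pricing-bar-select" id="space-prices" name="space-prices"><option selected="selected" value="£360">Per Day</option>
<option value="£1,260">Per Week</option>
<option value="£5,460">Per Month</option>
<option value="£16,380">Per Quarter</option>
<option value="£65,520">Per Year</option></select>"""

soup = BeautifulSoup(html, "html.parser")
select = soup.find("select", {"id": "space-prices"})

# Map each option's visible text to its "value" attribute
prices = {o.get_text(strip=True): o.get("value") for o in select.find_all("option")}

# selected=True matches any <option> that carries a "selected" attribute
selected = select.find("option", selected=True)
selected_pair = (selected.get_text(strip=True), selected.get("value"))


def to_pounds(value):
    """Hypothetical helper: turn a string such as '£1,260' into the integer 1260."""
    return int(value.lstrip("£").replace(",", ""))


for label, value in prices.items():
    print(label, value, to_pounds(value))
print("Selected:", selected_pair)
```

A dictionary keeps each label tied to its value, which avoids indexing two parallel lists with a hard-coded range.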

Answer 1: score 3, accepted, 2016-11-11 04:26:48

Answer 2: score 1, 2016-11-11 03:40:36