
How to get data from a combobox using Beautifulsoup and Python?

I'm trying to get the selected text from a combobox. I was searching the text for the "selected" state, but in the source code there are two items with that state, and I only want the one that is actually selected.

I am using the following code to search for it by the "span" tag and the id, but I get an error:

login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button").get_text(strip=True)
print(sexo)

Error: 'NoneType' object has no attribute 'get_text'

I also tried the following approach, which returns None:

login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')[0].value
print(sexo)

HTML source:

<tr class="FacetFilaTD"><td class="FacetDataTDPADL5">Sexo</td><td colspan="3" class="FacetFieldCaptionTD">
<div class="control-valid"><select name="sexo" style="width: 140px; display: none;" id="sexo" class="clasecombo_valid" tabindex="0" aria-disabled="false">
<option selected="" value="">[SELECCIONAR]</option><option value="M" selected="">Masculino</option><option value="F">Femenino</option></select>
<span><a class="ui-selectmenu ui-widget ui-state-default ui-selectmenu-dropdown ui-state-active ui-corner-top" id="sexo-button" role="button" href="#nogo" tabindex="0" aria-haspopup="true" aria-owns="sexo-menu" aria-disabled="false" style="width: 140px;"><span class="ui-selectmenu-status">Masculino</span><span class="ui-selectmenu-icon ui-icon ui-icon-triangle-1-s"></span></a></span><span class="requerido">*</span></div>
</td></tr>

I just want to get the selected item, which in this case is "Masculino".

From what I can see of the HTML, there is no span with id="sexo-button" (that id belongs to an a element), so BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button") would have returned None, which is why you got the error from get_text.
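To illustrate the point above, here is a minimal sketch run against a trimmed copy of the HTML from the question. It uses the built-in "html.parser" instead of lxml only to avoid an extra dependency, and the names html, wrong, button, status and label are illustrative. It also assumes the jQuery UI markup is present in the response at all; if that markup is generated by JavaScript in the browser, it will not be in login_request.text and this lookup will still come back empty.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumed to be what the server returns).
html = (
    '<div class="control-valid"><select name="sexo" id="sexo">'
    '<option selected="" value="">[SELECCIONAR]</option>'
    '<option value="M" selected="">Masculino</option>'
    '<option value="F">Femenino</option></select>'
    '<span><a id="sexo-button" role="button" href="#nogo">'
    '<span class="ui-selectmenu-status">Masculino</span>'
    '<span class="ui-selectmenu-icon"></span></a></span></div>'
)
soup = BeautifulSoup(html, "html.parser")

# The id exists, but on an <a>, so restricting the search to <span> finds nothing.
wrong = soup.find("span", id="sexo-button")

# Searching by id alone finds the <a>; the visible label is in a nested <span>.
button = soup.find(id="sexo-button")
status = button.find("span", class_="ui-selectmenu-status") if button else None

# Guard against None before calling get_text to avoid the AttributeError.
label = status.get_text(strip=True) if status else None
print(wrong, button.name, label)
```

Checking for None before calling get_text turns a crash into a value you can test for, which helps when the page did not load as expected.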

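As for your second attempt, bs4 Tag objects have no value property. Attribute access such as .value is treated as a search for a child tag named value, which is why you got None that time. To read the HTML attribute, use ['value'] or .get('value'); to read the visible text, use get_text().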
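The difference between attribute-style access and reading an HTML attribute can be seen in a short sketch. The snippet is a cut-down version of the select from the question, parsed with "html.parser" for convenience; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = (
    '<select name="sexo">'
    '<option selected="" value="">[SELECCIONAR]</option>'
    '<option value="M" selected="">Masculino</option>'
    '<option value="F">Femenino</option></select>'
)
soup = BeautifulSoup(html, "html.parser")
option = soup.select_one('select[name=sexo] option[value="M"]')

# Dotted access looks for a child tag called <value>; there is none, so this is None.
dotted = option.value

# HTML attributes are read like dictionary keys, or with .get() for a safe lookup.
attr_value = option["value"]
missing = option.get("data-missing")  # hypothetical attribute, used only to show the None default

# The text shown to the user is the tag's content, not an attribute.
text = option.get_text(strip=True)
print(dotted, attr_value, missing, text)
```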

You should actually try combining the two, like this:

sexo = BeautifulSoup(login_request.text, 'lxml').select_one('select[name=sexo] option[selected]').get_text(strip=True)

(If it doesn't work, print what ...select_one('select[name=sexo]') and ...select_one('select[name=sexo] option[selected]') return individually, in case the page itself wasn't loaded properly by session.)
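One way to do that check is sketched below. The function diagnose and the two sample strings are hypothetical, written only for illustration; in practice you would pass login_request.text.

```python
from bs4 import BeautifulSoup

def diagnose(html):
    """Report whether the select exists and how many options are marked selected."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "select_found": soup.select_one("select[name=sexo]") is not None,
        "selected_count": len(soup.select("select[name=sexo] option[selected]")),
    }

# A page containing the form, and a stand-in for a page where the login failed.
form_page = (
    '<select name="sexo"><option selected="" value="">[SELECCIONAR]</option>'
    '<option value="M" selected="">Masculino</option></select>'
)
other_page = "<p>Please log in</p>"

print(diagnose(form_page))
print(diagnose(other_page))
```

If select_found is False, the problem is likely in the request (login, cookies, or JavaScript-rendered content) rather than in the selector.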

You should also note that with the combined code you will actually get [SELECCIONAR], since that option is also marked selected in the HTML provided and select_one returns the first match. To skip it, you can instead try:

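selectedOpts = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')
# keep only options with a non-empty value, which drops the [SELECCIONAR] placeholder
selectedOpts = [s for s in selectedOpts if s.get('value')]
sexo = selectedOpts[0].get_text(strip=True) if selectedOpts else None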
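Putting the filtering approach together, here is a self-contained sketch that can be run without the live site. The helper selected_option_text is a hypothetical name, the HTML is copied from the question in trimmed form, and "html.parser" stands in for lxml; treat it as an illustration of the idea rather than a drop-in solution.

```python
from bs4 import BeautifulSoup

def selected_option_text(html, select_name):
    """Return the text of the first selected option with a non-empty value, or None."""
    soup = BeautifulSoup(html, "html.parser")
    options = soup.select(f'select[name="{select_name}"] option[selected]')
    # Placeholder entries such as [SELECCIONAR] carry value="", so they are filtered out.
    chosen = [opt for opt in options if opt.get("value")]
    return chosen[0].get_text(strip=True) if chosen else None

html = (
    '<select name="sexo" id="sexo">'
    '<option selected="" value="">[SELECCIONAR]</option>'
    '<option value="M" selected="">Masculino</option>'
    '<option value="F">Femenino</option></select>'
)

sexo = selected_option_text(html, "sexo")
nothing = selected_option_text("<p>no form here</p>", "sexo")
print(sexo, nothing)
```

Returning None when nothing matches keeps the caller from hitting an IndexError or AttributeError on pages where the select is missing.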

