How to get data from a combobox using BeautifulSoup and Python?
I'm trying to get the selected text from a combobox. I was searching the source for the "selected" state, but two items in the source code have that state, and I only want the one that is actually selected.
I am using the following code to search for it by the "span" tag and the id, but it fails with the error shown below:
login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button").get_text(strip=True)
print(sexo)
Error: 'NoneType' object has no attribute 'get_text'
I also tried looking for it in the following way, which gave me None:
login_request = session.post(url, data=payload, headers=headers, cookies=cookies)
sexo = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')[0].value
print(sexo)
Source HTML:
<tr class="FacetFilaTD">
  <td class="FacetDataTDPADL5">Sexo</td>
  <td colspan="3" class="FacetFieldCaptionTD">
    <div class="control-valid">
      <select name="sexo" style="width: 140px; display: none;" id="sexo" class="clasecombo_valid" tabindex="0" aria-disabled="false">
        <option selected="" value="">[SELECCIONAR]</option>
        <option value="M" selected="">Masculino</option>
        <option value="F">Femenino</option>
      </select>
      <span><a class="ui-selectmenu ui-widget ui-state-default ui-selectmenu-dropdown ui-state-active ui-corner-top" id="sexo-button" role="button" href="#nogo" tabindex="0" aria-haspopup="true" aria-owns="sexo-menu" aria-disabled="false" style="width: 140px;"><span class="ui-selectmenu-status">Masculino</span><span class="ui-selectmenu-icon ui-icon ui-icon-triangle-1-s"></span></a></span>
      <span class="requerido">*</span>
    </div>
  </td>
</tr>
I just want to get the selected item, which in this case is "Masculino".
From what I can see of the HTML, there is no span with id="sexo-button" (that id sits on an a element), so BeautifulSoup(login_request.text, 'lxml').find("span", id="sexo-button") would have returned None, which is why you got the error from get_text.
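To illustrate the point about the missing span, here is a minimal sketch run against a trimmed copy of the markup from the question. It assumes the widget markup is actually present in the response text, and it uses the built-in 'html.parser' instead of 'lxml' only to avoid an extra dependency. The variable names are made up for the example.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumed to match the real response).
html = '''
<div class="control-valid">
  <span><a id="sexo-button" role="button" href="#nogo">
    <span class="ui-selectmenu-status">Masculino</span>
  </a></span>
</div>
'''

soup = BeautifulSoup(html, "html.parser")

missing = soup.find("span", id="sexo-button")  # no <span> carries this id
button = soup.find(id="sexo-button")           # match on the id alone, any tag name

# Guard against None before calling get_text to avoid the AttributeError.
label = button.get_text(strip=True) if button is not None else None
print(missing, button.name, label)
```

Searching by id alone sidesteps the tag-name mismatch, and the None check keeps a missing element from raising.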
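As for your second attempt, bs4 Tags don't have a value property: attribute access such as tag.value is treated as a search for a child tag named value, and since there is none it returns None. That is why you were getting None that time. The HTML attribute itself is read with tag.get('value') or tag['value'].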
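The difference between attribute access on a Tag and reading an HTML attribute can be checked with a short sketch like the one below. The one-line markup is a cut-down version of the question's select element, and 'html.parser' is used here as a stand-in for 'lxml'.

```python
from bs4 import BeautifulSoup

# Cut-down version of the <select> from the question.
html = '<select name="sexo"><option value="M" selected="">Masculino</option></select>'
option = BeautifulSoup(html, "html.parser").select_one("option[selected]")

dot_access = option.value              # looks for a child <value> tag, so None
attr_get = option.get("value")         # reads the HTML attribute, None if absent
attr_index = option["value"]           # same, but raises KeyError if absent
text = option.get_text(strip=True)     # the visible label of the option

print(dot_access, attr_get, attr_index, text)
```

Using .get() is the safer choice when the attribute might be absent, since indexing raises instead of returning None.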
You should actually try combining the two, like this:
sexo = BeautifulSoup(login_request.text, 'lxml').select_one('select[name=sexo] option[selected]').get_text(strip=True)
(If it doesn't work, print ...select_one('select[name=sexo]') and ...select_one('select[name=sexo] option[selected]') individually to see what each returns, in case the page itself wasn't loaded properly by session.)
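One way to do that step-by-step check is sketched below. The helper name inspect_select is hypothetical, the sample string stands in for login_request.text, and 'html.parser' replaces 'lxml' just to keep the example dependency-free.

```python
from bs4 import BeautifulSoup

def inspect_select(html_text):
    """Return the intermediate matches so each step can be checked separately."""
    soup = BeautifulSoup(html_text, "html.parser")
    select_tag = soup.select_one("select[name=sexo]")
    selected_option = soup.select_one("select[name=sexo] option[selected]")
    return select_tag, selected_option

# Stand-in for login_request.text: a page without the form, as a failed login might return.
select_tag, selected_option = inspect_select("<html><body>Login failed</body></html>")

print(select_tag)       # None here means the <select> is not in the response at all
print(selected_option)  # None here means no option carries the selected attribute
```

If the first value is None, the problem is with the request or login rather than with the selector.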
You should also note that with the combined code you'll actually get [SELECCIONAR], since that option is also marked selected in the HTML provided and select_one returns the first match. To skip it, you can instead try:
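selectedOpts = BeautifulSoup(login_request.text, 'lxml').select('select[name=sexo] option[selected]')
# keep only options with a non-empty value, which drops the [SELECCIONAR] placeholder
selectedOpts = [s for s in selectedOpts if s.get('value')]
sexo = selectedOpts[0].get_text(strip=True) if selectedOpts else None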
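Putting the pieces together, the following self-contained sketch runs the same selectors against the select element copied from the question, so the result can be checked without a live session. It uses 'html.parser' in place of 'lxml' to avoid the extra dependency, and the names first, chosen and code are illustrative only.

```python
from bs4 import BeautifulSoup

# The <select> element as posted in the question (style and aria attributes omitted).
html = '''
<select name="sexo" id="sexo" class="clasecombo_valid">
  <option selected="" value="">[SELECCIONAR]</option>
  <option value="M" selected="">Masculino</option>
  <option value="F">Femenino</option>
</select>
'''

soup = BeautifulSoup(html, "html.parser")

# select_one returns the first match, which is the placeholder option.
first = soup.select_one("select[name=sexo] option[selected]").get_text(strip=True)

# Filter out selected options whose value attribute is empty.
chosen = [o for o in soup.select("select[name=sexo] option[selected]") if o.get("value")]
sexo = chosen[0].get_text(strip=True) if chosen else None
code = chosen[0]["value"] if chosen else None

print(first, sexo, code)
```

The visible label comes from get_text, while the submitted form value comes from the value attribute, so it is worth deciding which of the two is needed.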