
Selenium - can't get text from span element

I'm very confused about getting text using Selenium.

There are span tags with some text inside them. When I search for them using driver.find_element_by_... , everything works fine.

The problem is that I can't get the text out of them.

I know the span tag is found, because I can call .get_attribute('outerHTML') on it and I see this:

<span class="branding">ThrivingHealthy</span>

But if I change .get_attribute('outerHTML') to .text, it returns an empty string, which is not correct, as you can see above.

Here is the example (the outputs are fragments of a dictionary):

display_site = element.find_element_by_css_selector('span.branding').get_attribute('outerHTML')

'display_site': u'<span class="branding">ThrivingHealthy</span>'

display_site = element.find_element_by_css_selector('span.branding').text

'display_site': u''

As you can clearly see, the text is there, but Selenium does not return it. What could be wrong?

EDIT: I've found a kind of workaround. I've just changed .text to .get_attribute('innerText').

But I'm still curious: why does it work this way?
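The workaround from the EDIT can be wrapped in a small helper. This is only a sketch: element_text is a hypothetical name, not part of Selenium, and it assumes the element exposes the usual .text property and .get_attribute() method of a Selenium WebElement.

```python
def element_text(element):
    """Return element.text, falling back to the innerText attribute.

    Hypothetical helper: `element` is assumed to behave like a Selenium
    WebElement (a .text property and a .get_attribute(name) method).
    """
    text = element.text
    if text:
        return text
    # .text was empty, so read the DOM property instead.
    # get_attribute may return None, so normalise that to "".
    return element.get_attribute("innerText") or ""
```

With this in place, the second lookup above would become display_site = element_text(element.find_element_by_css_selector('span.branding')).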

The problem is that a LOT of tags match span.branding . When I queried that page using find_elements (plural), it returned 20 tags. Each tag seems to be duplicated. I'm not sure why, but my guess is that one set is hidden while the other is visible. From what I can tell, the first of each pair is hidden, and that's probably why you can't pull text from it.

Selenium is designed not to interact with elements that a user can't interact with, and .text only returns text that is actually visible on the page. That's likely why you can locate the element but get an empty string when you ask for its text.

Your best bet is to pull the entire set with find_elements and then loop through it, getting the text of each element. You will loop through about 20 elements and only get text from 10 of them, but it looks like that still gives you the complete set of values. It's weird, but it should work.
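A minimal sketch of that loop follows. The function names visible_texts and branding_texts are made up for illustration, and the code assumes a driver that is already on the page in question. The locator is passed as the plain string "css selector", which is the value of By.CSS_SELECTOR, so the sketch does not depend on the find_elements_by_... helpers used in the question.

```python
def visible_texts(elements):
    """Return the non-empty .text of each element, in page order.

    Hidden elements report an empty .text, so they are skipped.
    """
    texts = []
    for el in elements:
        text = el.text.strip()
        if text:
            texts.append(text)
    return texts


def branding_texts(driver):
    """Collect the visible text of every span.branding on the page.

    Assumption: `driver` is a Selenium WebDriver with the page loaded.
    """
    # "css selector" is the string value of By.CSS_SELECTOR.
    elements = driver.find_elements("css selector", "span.branding")
    return visible_texts(elements)
```

If the hidden and visible copies really do come in pairs, a page with 20 matches should give back a list of 10 strings.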

