How do I get the non-tagged text inside a div?

I am working on a Python script that uses Selenium, and I am having trouble getting the text (III. AIR COMBAT) that sits directly inside this div:

<div class="vs901-4">
<i id="copyarticle" style="cursor:pointer; color:white;margin-right:10px;" class="fa fa-copy"></i>
<span id="copiednotif" class="badge badge-pills badge-success" style="text-weight:300;cursor:pointer; margin-left: 5px;margin-right:5px;"></span>
<span id="profileid" class="hidden"> website link</span>
                             III. AIR COMBAT</div>

I tried the usual full XPath approach:

self.driver.find_element_by_xpath("/html/body/div[2]/div/div[2]/div[2]/div/div/div/div/div/div[1]/div[4]/text()").get_attribute("innerHTML") 

That is what I have been using for the other texts I needed (none of them was untagged text sitting directly inside a div), and they all worked. This one, however, gives me the following error:

"The result of the xpath expression "/html/body/div[2]/div/div[2]/div[2]/div/div/div/div/div/div[1]/div[4]/text()" is: [object Text]. It should be an element."

Thanks for the help.

The problem is the trailing /text() in the XPath: it selects a text node rather than an element, and Selenium can only return elements, so there is nothing to call get_attribute("innerHTML") on. Try selecting the div itself. Note that its innerHTML also includes the markup of the child <i> and <span> elements:

self.driver.find_element_by_xpath("/html/body/div[2]/div/div[2]/div[2]/div/div/div/div/div/div[1]/div[4]").get_attribute("innerHTML") 

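or, if you only need the rendered text, read the element's text property. This also includes the visible text of any child elements, and keeping /text() at the end of the XPath would raise the same error:

self.driver.find_element_by_xpath("/html/body/div[2]/div/div[2]/div[2]/div/div/div/div/div/div[1]/div[4]").text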
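
The distinction between an element and a text node can be reproduced without a browser. The following is only an offline sketch using Python's standard xml.etree.ElementTree, not Selenium. The markup is a simplified copy of the snippet from the question (style attributes dropped), and own_text is a hypothetical helper name chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the markup from the question (style attributes dropped).
HTML = """<div class="vs901-4">
<i id="copyarticle" class="fa fa-copy"></i>
<span id="copiednotif" class="badge badge-pills badge-success"></span>
<span id="profileid" class="hidden"> website link</span>
                             III. AIR COMBAT</div>"""


def own_text(element):
    """Return the text directly inside `element`, skipping child elements' text."""
    # ElementTree keeps the text before the first child in .text and the text
    # that follows each child in that child's .tail.
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts).strip()


div = ET.fromstring(HTML)
print([child.tag for child in div])  # ['i', 'span', 'span']
print(own_text(div))                 # III. AIR COMBAT
```

The wanted string is not one of the div's child elements. It is loose text that follows the last span, which is why an XPath ending in /text() yields a text node instead of something Selenium can wrap as an element.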

Here is a helper function that uses JavaScript to return only the element's own text, excluding the text of its child elements.

def get_text_exclude_children(element):
    # Assumes a module-level `driver` (the WebDriver that located `element`).
    return driver.execute_script(
        """
        // Concatenate only the direct text-node children of the element.
        var parent = arguments[0];
        var child = parent.firstChild;
        var textValue = "";
        while (child) {
            if (child.nodeType === Node.TEXT_NODE) {
                textValue += child.textContent;
            }
            child = child.nextSibling;
        }
        return textValue;
        """,
        element).strip()

Now you can use the function as shown below.

element = driver.find_element_by_xpath("//div[@class='vs901-4']")
elementOnlyText = get_text_exclude_children(element)
print(elementOnlyText)
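
The find_element_by_* helpers used in this thread were removed in Selenium 4.3, where find_element(by, value) is the replacement. As a hedged sketch of the same idea for newer versions, the variant below takes the driver as an explicit argument instead of relying on a global. The names get_own_text and OWN_TEXT_JS are hypothetical, not part of Selenium.

```python
# JavaScript that concatenates the direct text-node children of arguments[0].
OWN_TEXT_JS = """
var text = "";
for (var node = arguments[0].firstChild; node; node = node.nextSibling) {
    if (node.nodeType === Node.TEXT_NODE) {
        text += node.textContent;
    }
}
return text;
"""


def get_own_text(driver, xpath):
    """Locate one element by XPath and return only its own text."""
    # "xpath" is the value of selenium's By.XPATH constant, so this call
    # works without importing By.
    element = driver.find_element("xpath", xpath)
    return driver.execute_script(OWN_TEXT_JS, element).strip()
```

With a running WebDriver session, a call such as get_own_text(driver, "//div[@class='vs901-4']") should return the same string as the helper above.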
