
XPath doesn't match when desired element contains child elements?

I have the following XPath expression:

//a[@attribute='my-attribute']

When the HTML that XPath is searching contains the following element, it matches as expected:

<a attribute="my-attribute">Some text</a>

But if there is an <svg> tag under that element, XPath returns no match:

<a attribute="my-attribute">
    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
        viewBox="0 0 24 24" focusable="false"></svg>
</a>

Why doesn't XPath match in this case? Is there a way I can modify my expression to make it match?

EDIT:

Apparently it has to do with the namespace on the <svg> element. Using the local-name() function makes it match in the XPath tester I'm using:

//*[local-name()='a' and @attribute='my-attribute']

However, this still doesn't match when running through Selenium WebDriver. Any idea how to get this working with Selenium?

You may be confused by how the XPath hosting environment is presenting the selected a elements.

Adding an svg element to the a element will not affect what's selected by

//a[@attribute='my-attribute']

In the case of

<a attribute="my-attribute">Some text</a>

the a element has a string value consisting of more than just whitespace characters, but with

<a attribute="my-attribute">
    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
        viewBox="0 0 24 24" focusable="false"></svg>
</a>

the a element has a string value that consists only of whitespace, so if the results of the selection are shown as text, you wouldn't see anything selected.

If you evaluate count(//a[@attribute='my-attribute']), you'll likely see the same result for both cases.
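
To check this claim outside any particular XPath tester, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is only an illustration: ElementTree supports a subset of XPath and is not Selenium. The <body> wrapper and the inspect helper are assumptions added for the example. The sketch counts the matching a elements in both fragments and collects each one's string value.

```python
import xml.etree.ElementTree as ET

# The two fragments from the question, wrapped in a hypothetical <body> root.
WITH_TEXT = '<body><a attribute="my-attribute">Some text</a></body>'
WITH_SVG = (
    '<body><a attribute="my-attribute">\n'
    '    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"\n'
    '        viewBox="0 0 24 24" focusable="false"></svg>\n'
    '</a></body>'
)


def inspect(markup):
    """Return (number of matching a elements, string value of the first match)."""
    root = ET.fromstring(markup)
    # ElementTree's XPath subset supports this attribute predicate.
    matches = root.findall(".//a[@attribute='my-attribute']")
    # Concatenating all descendant text approximates the XPath string value.
    string_value = "".join(matches[0].itertext()) if matches else ""
    return len(matches), string_value


text_count, text_value = inspect(WITH_TEXT)
svg_count, svg_value = inspect(WITH_SVG)
print(text_count, repr(text_value))
print(svg_count, repr(svg_value))
```

Both fragments yield exactly one match. Only the second has a string value made up entirely of whitespace, which is why a tool that displays selections as text can appear to show nothing.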

The following is a possible solution in VB.NET.

Imports System.Text.RegularExpressions
Imports System.Xml

Public Class XmlNodeListWithNamespace
    ' see https://stackoverflow.com/questions/55385520/xpath-doesnt-match-when-desired-element-contains-child-elements
    ' @JaSON I would have thought the same thing,
    ' but removing the xmlns attribute from the svg tag
    ' causes the //a[@attribute='my-attribute'] expression to match.
    ' – Andrew Mairose
    ' Mar 28 at 13:07 "Asked 5 months ago  Active 5 months ago" implies 2019-03-28 13:07.

    ' Therefore, I first considered deleting all occurrences of
    '       xmlns="" and xmlns="http://www.w3.org/1999/xhtml"
    ' I did this using the following Replacement.
    ' gstrHtml = Regex.Replace(
    '            input:=gstrHtml,
    '            pattern:=" *xmlns=""[^""]*""",
    '            replacement:="",
    '            options:=RegexOptions.IgnoreCase
    '        )

    ' However, the solution below retains the namespace, while avoiding unsightly xpath strings.

    ''' <summary>
    ''' For a given xpath, returns an XmlNodeList, taking account of the xmlns namespace.
    ''' </summary>
    ''' <param name="oXmlDocument">The current XML document.</param>
    ''' <param name="xpath">A normal xpath string, without any namespace qualifier.</param>
    ''' <returns>The XmlNodeList for the given xpath.</returns>
    Public Shared Function NodeList(
        oXmlDocument As XmlDocument,
        xpath As String
    ) As XmlNodeList

        Dim strXpath As String = xpath

        ' Insert Namespace Qualifier.  For example, 
        '    "//pre"                                            becomes "//x:pre"
        '    "/html/body/form/div/pre"                          becomes "/x:html/x:body/x:form/x:div/x:pre"
        '    "//div[@id='nv_bot_contents']/pre"                 becomes "//x:div[@id='nv_bot_contents']/x:pre"
        '    "//div[@id='nv_bot_contents']/pre[@data-xxx='X2']" becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx='X2']"
        '    "//div[@id='nv_bot_contents']/pre[@data-xxx]"      becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx]"
        '    "//pre[@data-xxx]"                                 becomes "//x:pre[@data-xxx]"
        strXpath = Regex.Replace(
                        input:=strXpath,
                        pattern:="(/)(\w+)",
                        replacement:="$1x:$2"
                    )

        ' See https://stackoverflow.com/questions/40796231/how-does-xpath-deal-with-xml-namespaces/40796315#40796315
        Dim oXmlNamespaceManager As New XmlNamespaceManager(nameTable:=oXmlDocument.NameTable)
        oXmlNamespaceManager.AddNamespace("x", "http://www.w3.org/1999/xhtml")

        Dim oXmlNodeList As XmlNodeList = oXmlDocument.SelectNodes(
            xpath:=strXpath,
            nsmgr:=oXmlNamespaceManager
        )

        Return oXmlNodeList

    End Function

End Class

Sample invocation:

Dim oXmlNodeList As XmlNodeList =
            XmlNodeListWithNamespace.NodeList(
                oXmlDocument:=oXmlDocument,
                xpath:="//pre"
            )
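
For readers without a .NET environment, the same prefix-rewriting idea can be sketched in Python with the standard library. This is an illustration under stated assumptions, not a port of the class above: the qualify helper, the XHTML_NS constant, and the sample document are hypothetical, and ElementTree only supports a subset of XPath. The regular expression is the same pattern used in the VB.NET code.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def qualify(xpath, prefix="x"):
    """Insert a namespace prefix before every name that directly follows a slash."""
    return re.sub(r"(/)(\w+)", rf"\1{prefix}:\2", xpath)


# Hypothetical XHTML document whose elements live in the default XHTML namespace.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<pre data-xxx="X2">first</pre><pre>second</pre>'
    '</body></html>'
)
root = ET.fromstring(DOC)

# Without a prefix, the name test looks for pre in no namespace and finds nothing.
unqualified = root.findall(".//pre")
# With the rewritten path and a prefix mapping, both pre elements are found.
qualified = root.findall(qualify(".//pre"), {"x": XHTML_NS})
print(len(unqualified), len(qualified))
print(qualify("//div[@id='nv_bot_contents']/pre"))
```

One limitation of this pattern is that it prefixes anything word-like after a slash, so a path such as //div/text() is rewritten to //x:div/x:text(), which is not valid. A name that is not preceded by a slash, such as a relative first step, is left unprefixed.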

