
How to iterate through DOM elements that match a CSS class using XPath?

I'm processing an HTML page that contains a variable number of p elements with the CSS class "myclass", using Python + Selenium RC.

When I try to select each node with this XPath:

//p[@class='myclass'][n]

(with n a natural number)

I get only the first p element with this CSS class for every n, unlike what happens when I iterate over ALL p elements with:

//p[n]

Is there any way I can iterate through elements by CSS class using XPath?

XPath 1.0 doesn't provide an iterating construct.

Iteration can be performed on the selected node-set in the language that is hosting XPath.

Examples:

In XSLT 1.0:

   <xsl:for-each select="someExpressionSelectingNodes">
     <!-- Do something with the current node -->
   </xsl:for-each>

In C#:

using System;
using System.IO;
using System.Xml;

public class Sample {

  public static void Main() {

    XmlDocument doc = new XmlDocument();
    doc.Load("booksort.xml");

    XmlNodeList nodeList;
    XmlNode root = doc.DocumentElement;

    nodeList=root.SelectNodes("descendant::book[author/last-name='Austen']");

    //Change the price on the books.
    foreach (XmlNode book in nodeList)
    {
      book.LastChild.InnerText="15.95";
    }

    Console.WriteLine("Display the modified XML document....");
    doc.Save(Console.Out);

  }
}
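
In Python (the asker's host language), the same idea might look like the sketch below. It uses only the standard library's xml.etree.ElementTree, which supports a small XPath subset and needs well-formed markup; the sample page and the names HTML, matches and texts are assumptions made up for illustration, and a real HTML page would probably need an HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; the markup is an assumption for illustration.
HTML = """
<html><body>
  <div><p class="myclass">one</p><p>skip</p></div>
  <div><p class="myclass">two</p><p class="myclass">three</p></div>
</body></html>
"""

root = ET.fromstring(HTML)

# Select the whole node-set once, then let Python do the iterating.
# findall() returns the matching elements in document order.
matches = root.findall(".//p[@class='myclass']")

texts = []
for p in matches:
    texts.append(p.text)

print(texts)  # ['one', 'two', 'three']
```

Note that the predicate compares the whole class attribute, exactly as the XPath in the question does, so an element with class="myclass other" would not match.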

XPath 2.0 has its own iteration construct:

   for $varname1 in someExpression1,
       $varname2 in someExpression2, 
      .  .  .  .  .  .  .  .  .  .  .
       $varnameN in someExpressionN 
    return
        SomeExpressionUsingTheVarsAbove

Now that I look again at this question, I think the real problem is not in iterating, but in using //.

This is a FAQ:

//p[@class='myclass'][1] 

selects every p element that has a class attribute with value "myclass" and that is the first such child of its parent. Therefore this expression may select many p elements (one per parent that has such a child), not only the first such p element in the document.

When we want to get the first p element in the document that satisfies the above predicate, one correct expression is:

(//p)[@class='myclass'][1] 

Remember: The [] operator has a higher priority (precedence) than the // abbreviation. Whenever you need to index the nodes selected by //, always put the expression to be indexed in parentheses.

Here is a demonstration:

<nums>
 <a>
  <n x="1"/>
  <n x="2"/>
  <n x="3"/>
  <n x="4"/>
 </a>
 <b>
  <n x="5"/>
  <n x="6"/>
  <n x="7"/>
  <n x="8"/>
 </b>
</nums>

The XPath expression:

//n[@x mod 2 = 0][1]

selects the following two nodes:

<n x="2" />
<n x="6" />

The XPath expression:

(//n)[@x mod 2 = 0][1]

selects exactly the first n element in the document with the wanted property:

<n x="2" />

Try this first with the following transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select="//n[@x mod 2 = 0][1]"/>
 </xsl:template>
</xsl:stylesheet>

and the result is two nodes:

<n x="2" />
<n x="6" />

Now, change the XPath expression as below and try again:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select="(//n)[@x mod 2 = 0][1]"/>
 </xsl:template>
</xsl:stylesheet>

and the result is what we really wanted, the first such n element in the document:

<n x="2" />
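
For readers without an XSLT processor at hand, the difference between the two expressions can also be emulated in plain Python. This is only a sketch of the semantics, not an XPath evaluation: the standard library's ElementTree does not implement mod or full XPath 1.0 positional predicates, so the two selections are written out by hand, and the names is_even, per_parent and first_in_document are made up for the example.

```python
import xml.etree.ElementTree as ET

XML = """<nums>
 <a><n x="1"/><n x="2"/><n x="3"/><n x="4"/></a>
 <b><n x="5"/><n x="6"/><n x="7"/><n x="8"/></b>
</nums>"""

root = ET.fromstring(XML)


def is_even(n):
    """Stands in for the predicate [@x mod 2 = 0]."""
    return int(n.get("x")) % 2 == 0


# Emulates //n[@x mod 2 = 0][1]: the position is counted per parent,
# so every parent contributes its first even-valued n child.
per_parent = []
for parent in root.iter():
    evens = [c for c in parent if c.tag == "n" and is_even(c)]
    if evens:
        per_parent.append(evens[0].get("x"))

# Emulates (//n)[@x mod 2 = 0][1]: the position is counted over the
# whole document-ordered node-set, so at most one node survives.
all_evens = [n for n in root.iter("n") if is_even(n)]
first_in_document = [n.get("x") for n in all_evens[:1]]

print(per_parent)         # ['2', '6']
print(first_in_document)  # ['2']
```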

Maybe all your p elements with this class are at the same level, so with //p[@class='myclass'] you receive the array of paragraphs with the specified class. So you should iterate through it using indexes, i.e. //p[@class='myclass'][1], //p[@class='myclass'][2], ..., //p[@class='myclass'][last()]

I don't think you're using the "index" for its real purpose. The //p[selection][index] syntax is actually telling you which position within its parent the element must have. So //p[selection][1] is saying that your selected p must be the first matching p child of its parent. //p[selection][2] is saying it must be the second such child. Depending on your HTML, it's likely this isn't what you want.

Given that you're using Selenium and Python, there are a couple of ways to do what you want, and you can look at this question to see them (there are two options given there, one in Selenium JavaScript, the other using the server-side Selenium calls).

Here's a C# code snippet that might help you out.

The key here is the Selenium function GetXpathCount(). It should return the number of occurrences of the XPath expression you are looking for.

You can enter //p[@class='myclass'] in XPather or any other XPath analysis tool to verify that multiple results are indeed returned. Then you just iterate through the results in your code.

In my case, it was all the list items in a UL that needed to be iterated (i.e. //li[@class='myclass']/ul/li), so based on your requirements it should be something like:

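// Count the matches first, then fetch each one by its 1-based index.
int numProductsInLeftNav = Convert.ToInt32(selenium.GetXpathCount("//p[@class='myclass']"));

List<string> productsInLeftNav = new List<string>();
for (int i = 1; i <= numProductsInLeftNav; i++) {
    // Note: [i] here is counted per parent, so this only walks all matches
    // when they share one parent; otherwise index "xpath=(//p[@class='myclass'])[i]".
    string productName = selenium.GetText("//p[@class='myclass'][" + i + "]");
    productsInLeftNav.Add(productName);
}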
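
Since the question uses Python, here is a rough Python counterpart of the loop above. It assumes the Selenium RC Python client's get_xpath_count() and get_text() methods and an explicit "xpath=" locator prefix (RC only auto-detects XPath locators that start with "//"). The helper texts_by_xpath and the FakeSelenium stand-in are made up for this sketch so it can run without a Selenium server; with a real session you would pass your selenium object instead.

```python
def texts_by_xpath(sel, xpath):
    """Return the text of every element matched by xpath, in document order."""
    count = int(sel.get_xpath_count(xpath))
    # Parenthesize before indexing so [i] counts across the whole document,
    # not per parent. The explicit "xpath=" prefix is there because the
    # locator no longer starts with "//".
    return [sel.get_text("xpath=(%s)[%d]" % (xpath, i))
            for i in range(1, count + 1)]


class FakeSelenium:
    """Hypothetical stand-in for a live Selenium RC session (demo only)."""

    def __init__(self, texts):
        self._texts = texts
        self.requested = []  # records every locator that was asked for

    def get_xpath_count(self, xpath):
        return len(self._texts)

    def get_text(self, locator):
        self.requested.append(locator)
        # Pull the trailing 1-based index out of "...)[i]".
        index = int(locator.rsplit("[", 1)[1].rstrip("]"))
        return self._texts[index - 1]


fake = FakeSelenium(["one", "two", "three"])
result = texts_by_xpath(fake, "//p[@class='myclass']")
print(result)             # ['one', 'two', 'three']
print(fake.requested[0])  # xpath=(//p[@class='myclass'])[1]
```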
