How to select elements with an exact class using cssselect in lxml?

Question

I'm scraping a website with lxml.html, but I've run into a problem. When I select elements, for example:

 html.cssselect('a.asig')

I expect to get only the elements with class="asig", but the selection also returns elements whose class attribute contains "asig" alongside other classes, for example:

<a class="asig drcha" ...>

What can I do to get only the elements whose class is exactly "asig", and not the elements whose class merely contains "asig"? Thanks!

Answer 1

Either use html.xpath and adjust the expression accordingly, or be very explicit when declaring the class to match. See the following code.

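from lxml import html

# Sample markup: one link with exactly class="asig", one with an extra class.
sample = '<root><a class="asig">I am the correct one.</a><a class="asig drcha">I am the wrong one.</a></root>'
tree = html.fromstring(sample)

# XPath: compare the whole class attribute against "asig".
print(tree.xpath("//a[@class='asig']/text()")[0])
# cssselect: an attribute selector does the same exact-value comparison.
print(tree.cssselect("a[class='asig']")[0].text)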

The result is as follows:

I am the correct one.
I am the correct one.
[Finished in 0.2s]

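Notice how cssselect is used in the last line: the attribute selector a[class='asig'] matches only when the whole class attribute equals "asig", whereas a.asig matches any element that has asig among its classes. Hope this helps.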
