findall and xpath question

Question

I have a text file called "html.txt" that contains some HTML code, as shown below:

<tr>
    <td class="name"><a href="/player/DAVID:RD" class=""><span>David Kwan</span> (DAVID)</a></td>
    <td class="teamid" style="">DAVID:RD</td>
    <td class="">District Player</td>
    <td class="">Red-Dragon Factory</td>
</tr>

Following the tutorial on the lxml website, I tried to use etree and the findall() method to extract the table data from the HTML code, but I cannot get the result printed as a string; what I get is <Element td at 0x267c1c0>.
I understand that findall() returns a list, but indexing it with [0] does not help either. By trial and error I also tried using the str function supported by XPath to force findall() to return a string, and that did not help either.

Can someone advise me on how to correct this?

from lxml import etree

page = open("C:/Python27/project/lxml_project/html.txt").read()
x = etree.HTML(page)
element = (x.findall('.//td[@class="teamid"]'))
print(element)

My second question: would xpath be a better solution than findall? Previously, when I tried xpath, it always returned only the first matching element, even though the page contains multiple <td> tags. Is it possible to apply xpath recursively with the Python lxml library?

Answer 1

Use the Element.text attribute to retrieve the text content of an element:

elements = x.findall('.//td[@class="teamid"]')
print([elem.text for elem in elements])

.findall() returns a list; you can use .find() to retrieve just the first match (or None if there are no matching elements).
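As for the xpath part of the question, here is a minimal sketch of the options side by side. It assumes the row from the question sits inside a <table> element, and it parses an inline string rather than html.txt so that it runs on its own; the variable names are only illustrative. A path starting with // already searches the whole document at every depth, so no manual recursion should be needed.

```python
from lxml import etree

# Inline copy of the row from the question, wrapped in <table> (an assumption)
html = """
<table>
<tr>
    <td class="name"><a href="/player/DAVID:RD" class=""><span>David Kwan</span> (DAVID)</a></td>
    <td class="teamid" style="">DAVID:RD</td>
    <td class="">District Player</td>
    <td class="">Red-Dragon Factory</td>
</tr>
</table>
"""

root = etree.HTML(html)

# find() returns the first matching element, or None when nothing matches
first = root.find('.//td[@class="teamid"]')
first_text = first.text if first is not None else None

# xpath() with a trailing text() step returns a list of strings, one per match
team_ids = root.xpath('//td[@class="teamid"]/text()')

# .text only holds the text before the first child element, so for a cell
# with nested tags, join the pieces yielded by itertext() instead
name_cell = root.find('.//td[@class="name"]')
full_name = "".join(name_cell.itertext())

print(first_text)
print(team_ids)
print(full_name)
```

Whether xpath() is "better" depends on the query: findall() supports only the simpler ElementPath subset, while xpath() accepts full XPath 1.0 expressions such as text() steps and functions.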


Answer 1: accepted, 2014-05-31 10:36:56
