findall and xpath problems
I have a text file called "html.txt" that contains some HTML code, as shown below:
<tr>
<td class="name"><a href="/player/DAVID:RD" class=""><span>David Kwan</span> (DAVID)</a></td>
<td class="teamid" style="">DAVID:RD</td>
<td class="">District Player</td>
<td class="">Red-Dragon Factory</td>
</tr>
Following the tutorial on the lxml website, I tried to use etree and the findall() method to extract the table data from the HTML code, but I cannot get the result printed as a string. What I get instead is <Element td at 0x267c1c0>.
I understand that the findall method returns a set or list of such elements, but using index 0 does not help either. By trial and error I also tried the str function that xpath supports to force findall to return a string, but that did not help either.
Can someone advise me on how to correct this?
from lxml import etree
page = open("C:/Python27/project/lxml_project/html.txt").read()
x = etree.HTML(page)
element = (x.findall('.//td[@class="teamid"]'))
print(element)
My second question: would using xpath instead of the findall method be a better solution? Previously, when I tried xpath, it always returned only the first matching element, even though the HTML page contains multiple <td> tags. Is it possible to apply xpath recursively with the Python lxml library?
Use the Element.text attribute to retrieve the text content of an element:
elements = x.findall('.//td[@class="teamid"]')
print([elem.text for elem in elements])
.findall() returns a list; you can use .find() to retrieve just the first match (or None if there are no matching elements).
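On the second question, here is a minimal sketch comparing findall(), find() and xpath() on markup modeled on the question's html.txt. The second table row (ALICE:BT) is made-up data, added only so that more than one match exists, and the markup is inlined as a string instead of being read from the file. In lxml, xpath() returns every match, and an expression starting with // already searches descendants at any depth, so no manual recursion should be needed.

```python
from lxml import etree

# Sample markup modeled on the question's html.txt. The second row is
# hypothetical data, added so that more than one <td class="teamid"> exists.
page = """
<table>
<tr>
<td class="name"><a href="/player/DAVID:RD" class=""><span>David Kwan</span> (DAVID)</a></td>
<td class="teamid" style="">DAVID:RD</td>
</tr>
<tr>
<td class="name"><a href="/player/ALICE:BT" class=""><span>Alice Lee</span> (ALICE)</a></td>
<td class="teamid" style="">ALICE:BT</td>
</tr>
</table>
"""

root = etree.HTML(page)

# findall() returns Element objects; read .text to get the string content.
team_ids = [td.text for td in root.findall('.//td[@class="teamid"]')]

# find() returns only the first matching Element, or None when nothing matches.
first = root.find('.//td[@class="teamid"]')
first_id = first.text if first is not None else None

# xpath() also returns every match; a trailing text() step yields strings directly.
team_ids_xpath = root.xpath('//td[@class="teamid"]/text()')

# .text only covers text before the first child element, so for nested markup
# such as the "name" cell, itertext() gathers the text of all descendants.
names = ["".join(td.itertext()) for td in root.findall('.//td[@class="name"]')]

print(team_ids)
print(first_id)
print(team_ids_xpath)
print(names)
```

Whether xpath() is "better" depends on the task: findall() supports only a limited subset of XPath, while xpath() accepts full XPath 1.0 expressions such as the text() step used here. If an earlier xpath attempt returned only one element, the expression itself (for example a positional index such as [1]) is a likely cause, though that cannot be confirmed without seeing the original expression.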