Trouble with data types after scraping a website with lxml and xpath
I'm scraping a website for data and end up pulling out numbers. The issue is that when I try to apply logic to the data in Python, its type comes back as
class 'lxml.etree._ElementStringResult'
My question is: can I somehow cast this data to a string or an int so that I can then use it in my logic statements?
Here is the code:
callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()
print callType
Here is the output:
76
When I try control statements on the data, nothing happens. I think it's because I'm applying logic to the wrong type.
callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()
print type(callType)
print callType
This is my output:
<class 'lxml.etree._ElementStringResult'>
76
So instead of running my control statements against an int, I am running them against a different type. I've tried typecasting the variable, but it remains the same datatype. Hope this helps...
xpath() may return a list of _ElementStringResult objects, not plain Python strings. The reason you might sometimes want _ElementStringResult objects is that, unlike str objects, they remember their parent element, which they make accessible through the getparent method.
You could convert this to a string or an integer by simply passing the object to str or int.
for span in item.xpath('.//span[contains(@id, "lblSignal")]'):
    callType = int(span.text_content())
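A likely reason the control statements appear to do nothing is that text is never equal to a number, so a test such as callType == 76 is simply False until the value has been converted. The sketch below illustrates this without needing lxml: SmartString is a hypothetical stand-in for the string-result type (an ordinary str subclass), and to_int is an illustrative helper, not part of lxml.

```python
class SmartString(str):
    """Hypothetical stand-in for lxml's string-result type (a str subclass)."""


def to_int(text, default=None):
    """Convert scraped text to an int; return `default` if it is not a whole number."""
    try:
        return int(str(text).strip())
    except ValueError:
        return default


call_type = SmartString(" 76 ")  # roughly what text_content() might hand back

print(call_type == 76)           # False: a string never equals an int

signal = to_int(call_type)       # now a real int
print(signal == 76)              # True

# Guard against cells that hold no number (for example "N/A") before comparing.
if signal is not None and signal > 50:
    print("strong signal")
```

Returning a default instead of raising keeps a scraping loop running when a cell is empty or non-numeric. If you would rather fail loudly, call int() directly as in the answer above.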