
Trouble with data types after scraping a website with lxml and xpath

I'm scraping a website for data and pulling out numbers. The problem is that when I try to use the data in logical comparisons in Python, its type comes back as

class 'lxml.etree._ElementStringResult'

My question: can I cast this data to a string or an int so that I can use it in my logic statements?

Here is the code:

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()

print(callType)

Here is the output:

76

When I use the value in control statements, nothing happens. I think that's because I'm comparing values of the wrong type.

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()
print(type(callType))
print(callType)

This is my output:

<class 'lxml.etree._ElementStringResult'>
76

So the value is not an int but a different type, which is why my control statements don't behave as expected. I've tried casting the variable, but it stays the same datatype. Hope this helps...

xpath() may return a list of _ElementStringResult objects rather than plain Python strings. These are lxml's "smart strings": unlike a plain str, each one remembers its parent element, which it makes accessible through the getparent() method. They are still text, though, so a comparison against an integer such as 76 is never true until the value is converted.
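To make the behaviour concrete, here is a minimal sketch for Python 3, where the smart string is a subclass of str. The markup and the id value ctl00_lblSignal are made up for illustration and do not come from the page being scraped.

```python
from lxml import html

# Hypothetical markup standing in for the scraped page.
doc = html.fromstring('<div><span id="ctl00_lblSignal">76</span></div>')

# Selecting text() nodes through xpath() yields lxml "smart strings".
texts = doc.xpath('.//span[contains(@id, "lblSignal")]/text()')
signal = texts[0]

is_text = isinstance(signal, str)      # a str subclass, so still text
parent_tag = signal.getparent().tag    # the element the text came from
equals_int = (signal == 76)            # text never equals an int
as_int = int(signal)                   # plain int, safe for comparisons

print(is_text, parent_tag, equals_int, as_int)
```

The comparison against 76 only becomes meaningful once the text has been passed through int().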

You can convert such a value to a plain string or an integer by simply passing the object to str or int.

for span in item.xpath('.//span[contains(@id, "lblSignal")]'):
    # int() turns the extracted text into a real integer for comparisons
    callType = int(span.text_content())
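Scraped pages sometimes contain empty cells or non-numeric placeholders, in which case a bare int() call raises ValueError. The following is a minimal sketch of a more defensive conversion. The helper to_int, its default parameter, and the threshold of 50 are illustrative names and values, not part of lxml or the original code.

```python
def to_int(text, default=None):
    """Convert scraped text to an int, or return default if it is not a whole number."""
    try:
        # str() yields a plain string; strip() drops surrounding whitespace.
        return int(str(text).strip())
    except ValueError:
        return default


call_type = to_int(" 76\n")

# Hypothetical control statement: compare only once the value is a real int.
if call_type is not None and call_type > 50:
    print("strong signal:", call_type)
```

In the loop above, this would be used as callType = to_int(span.text_content()), with a check for None before any comparison.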


 