Trouble with data types after scraping a website with lxml and xpath

Question

I'm scraping a website for data and end up pulling out numbers. The issue is that when I try to apply logic to the data in Python, its type comes back as

<class 'lxml.etree._ElementStringResult'>

My question is: can I cast this data to a string or int so that I can use it in my logic statements?

Here is the code:

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()

print(callType)

Here is the output:

When I use this value in control statements, nothing happens. I suspect it's because I'm applying logic to the wrong type.

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()
print(type(callType))
print(callType)

This is my output:

<class 'lxml.etree._ElementStringResult'>
76

So instead of running control statements against an int, I'm working with a different type. I've tried typecasting the variable, but it stays the same data type. Hope this helps...

Answer 1

xpath() may return a list of _ElementStringResult objects rather than plain Python strings. You might sometimes want an _ElementStringResult because, unlike a str, it remembers its parent element, which it exposes through the getparent method.
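As a rough illustration of the parent link, here is a minimal sketch written for Python 3 with lxml installed. The HTML snippet and the id value ctl00_lblSignal are made up for the example, and the exact class name of the result differs between Python 2 and Python 3, so the code checks only that the result is a string subclass.

```python
from lxml import html

# Hypothetical markup standing in for the scraped page.
doc = html.fromstring('<div><span id="ctl00_lblSignal">76</span></div>')

# A text() query returns lxml "smart" strings (the default, smart_strings=True).
results = doc.xpath('.//span[contains(@id, "lblSignal")]/text()')
value = results[0]

# The result behaves like a string but also knows which element it came from.
parent = value.getparent()
print(isinstance(value, str))
print(parent.tag, parent.get('id'))

# str() and int() produce ordinary Python objects without the parent link.
plain = str(value)
number = int(value)
print(type(plain), number)
```

If the parent link is not needed, xpath() also accepts smart_strings=False, which should return plain strings directly.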

You can convert the value to a string or integer by passing the object to str or int:

for span in item.xpath('.//span[contains(@id, "lblSignal")]'):
    callType = int(span.text_content())
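One likely reason the control statements appeared to do nothing is that text compared against a number never matches. The sketch below (Python 3) shows that, plus a defensive conversion for scraped text that may carry whitespace or be non-numeric. The helper to_int, the class FakeSmartString, and the threshold of 50 are all hypothetical; FakeSmartString only stands in for lxml's string result type so that the example runs without lxml.

```python
def to_int(text, default=None):
    """Convert scraped text such as ' 76\n' to an int, or return default."""
    try:
        # int() tolerates surrounding whitespace; strip() makes the intent explicit.
        return int(str(text).strip())
    except ValueError:
        # Raised for empty strings or non-numeric text such as 'N/A'.
        return default


class FakeSmartString(str):
    """Hypothetical stand-in for lxml's string result type (a str subclass)."""


raw = FakeSmartString(" 76\n")

# Comparing the raw text with a number never matches: a string is not an int.
print(raw == 76)  # False

call_type = to_int(raw)
if call_type is not None and call_type > 50:
    print("signal above threshold:", call_type)
```

Returning a default instead of raising keeps a scraping loop going when one cell is blank. Whether that is the right choice depends on how strictly the data should be validated.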

solution1
5 ACCEPTED 2015-03-18 19:14:03
