The goal of this function is to check whether an Amazon item is unavailable or not.
def check(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    page = requests.get(url, headers = headers)
    doc = html.fromstring(page.content)
    XPATH_AVAILABILITY = '//div[@id ="availability"]//text()'
    RAw_AVAILABILITY = doc.xpath(XPATH_AVAILABILITY)
    AVAILABILITY = ''.join(RAw_AVAILABILITY).strip()
    if any(re.match(r'unavailable', str(AVAILABILITY), re.IGNORECASE)):
        return "UNAVAILABLE"
    else:
        return "AVAILABLE"
I checked the type() of the AVAILABILITY variable (it's a string), and it looks like this when the item is unavailable:
Currently unavailable.
We don't know when or if this item will be back in stock.
and like this (type: string) when it's available:
In Stock.
or In stock.
That's why I opted for the regex for detection of "unavailable" in the output. But the error says:
File "scra.py", line 68, in
if any(re.match(r'unavailable', check(i), re.IGNORECASE)):
TypeError: 'NoneType' object is not iterable
The function never returns None, which is why I'm surprised. Please help me solve this.
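any(x) iterates over x and returns True if it finds an element that evaluates as true, or False if it gets to the end.
re.match returns a Match object if a match is found, or None otherwise. Note that re.match only tries to match at the beginning of the string, so a string such as "Currently unavailable." does not match the pattern 'unavailable'.
Your content must not match the regular expression, so re.match returns None, and any can't iterate over it.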
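One possible fix, as a minimal sketch: drop any() altogether (even a successful Match object is not iterable) and use re.search, which scans the whole string instead of only its start. The helper name classify_availability is hypothetical and not part of the original code; it takes the already-extracted availability text so it can be tried without a network request.

```python
import re

# Hypothetical helper (not from the original post): classifies the text
# scraped from the availability div.
def classify_availability(text):
    # re.search looks for the pattern anywhere in the string and returns
    # a Match object or None, which can be tested directly in an if.
    if re.search(r'unavailable', text, re.IGNORECASE):
        return "UNAVAILABLE"
    return "AVAILABLE"

unavailable_text = ("Currently unavailable.\n"
                    "We don't know when or if this item will be back in stock.")

# re.match anchors at position 0, so it finds nothing in this text.
print(re.match(r'unavailable', unavailable_text, re.IGNORECASE))  # None

print(classify_availability(unavailable_text))  # UNAVAILABLE
print(classify_availability("In Stock."))       # AVAILABLE
```

Inside check(), the same condition would be applied to AVAILABILITY. Note that with this sketch an empty string (for example, when the availability div is not found on the page) is also reported as "AVAILABLE", so you may want to handle that case separately.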