How to check if an element exists in xpath and return null/no value if it doesn't?
I am trying to scrape data from a website, given a list of URLs stored in "data".
I noticed that some URLs have no element matching the "og_price" and "discount" XPaths, and I get either a NoSuchElementException or an error saying "og_price" and "discount" are not defined, presumably because that page doesn't contain a match for the XPath.
I want to check whether an XPath exists on a page (which I tried to do with try-except) and return a null value, or just the string "no", when it doesn't. I am stuck on how to do that, because I later call ".text" on "og_price" and "discount", which fails with: 'str' object has no attribute 'text'.
for url in data:
    driver.get(url)
    item_name = driver.find_element_by_xpath('//span[@id="productTitle"]')
    brand_name = driver.find_element_by_xpath('//*[@class="a-spacing-small"][.//*[contains(.,"Brand")]]/td[@class="a-span9"]/span')
    price = driver.find_element_by_xpath('//div[@class="a-section a-spacing-micro"]/span[@id="price_inside_buybox"]')
    try:
        og_price = driver.find_element_by_xpath('//span[@class="priceBlockStrikePriceString a-text-strike"]')
        discount = driver.find_element_by_xpath('//td[@class="a-span12 a-color-price a-size-base priceBlockSavingsString"]')
    except NoSuchElementException:
        og_price = null
        discount = null
    row = { 'Item Name': item_name.text,
            'Brand Name': brand_name.text,
            'Price': price.text,
            'Original Price': og_price.text,
            'URL': url
          }
@Harry Kim the proper way of checking whether an element exists is to wrap the lookup in a try/except block, as you did above. To avoid the error from calling .text on a missing element, set the variable to None in the except branch and add an if check before assigning the element's text to the row dict. Something like:
# Required at the top of the script for the except clause to work:
from selenium.common.exceptions import NoSuchElementException

try:
    og_price = driver.find_element_by_xpath('//span[@class="priceBlockStrikePriceString a-text-strike"]')
    discount = driver.find_element_by_xpath('//td[@class="a-span12 a-color-price a-size-base priceBlockSavingsString"]')
except NoSuchElementException:
    # Python's null value is None; "null" is not a defined name
    og_price = None
    discount = None
row = { 'Item Name': item_name.text,
        'Brand Name': brand_name.text,
        'Price': price.text,
        'URL': url
      }
# Only read .text when the element was actually found
if og_price is not None:
    row["Original Price"] = og_price.text
else:
    row["Original Price"] = "N/A"
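One limitation of the snippet above is that a missing "og_price" also discards "discount", since both lookups share one try block. A possible alternative, sketched here rather than taken from the original answer, is to use find_elements (plural), which returns an empty list instead of raising when nothing matches. The helper name text_or_default is hypothetical, and the string "xpath" is the value of By.XPATH from selenium.webdriver.common.by, written out so the sketch needs no imports.

```python
def text_or_default(driver, xpath, default="N/A"):
    """Return the text of the first element matching xpath, or default.

    find_elements returns an empty list when nothing matches,
    so no try/except is needed here.
    """
    matches = driver.find_elements("xpath", xpath)  # "xpath" == By.XPATH
    return matches[0].text if matches else default
```

Inside the loop, each optional field could then be read independently, for example row["Original Price"] = text_or_default(driver, '//span[@class="priceBlockStrikePriceString a-text-strike"]'), and likewise for the discount.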
Also, it is a good idea to use functions like find_element_by_css_selector or find_element_by_id if the element has a well-defined id or CSS class. We go for XPath only when the target element does not have a proper id or CSS class.
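As a rough sketch of what id-based lookups could look like for the required fields in the question: the locators below are assumptions derived from the question's XPaths (the id alone is a looser match than the full XPath for the price), and read_required is a hypothetical helper. It uses the generic find_element(by, value) call, which is also the form to use on recent Selenium 4 releases, where the find_element_by_* helpers are no longer available. The string "id" is the value of By.ID.

```python
# Locator strategy string; equals By.ID in selenium.webdriver.common.by,
# written out so this sketch has no imports.
BY_ID = "id"

# Hypothetical locators, assumed from the XPaths in the question.
REQUIRED_LOCATORS = {
    "Item Name": (BY_ID, "productTitle"),
    "Price": (BY_ID, "price_inside_buybox"),
}

def read_required(driver, locators=REQUIRED_LOCATORS):
    """Read the text of each required element from the current page.

    Assumes driver.get(url) has already been called. find_element
    raises if an element is missing, which is the desired behavior
    for fields that every page is expected to have.
    """
    return {
        field: driver.find_element(by, value).text
        for field, (by, value) in locators.items()
    }
```

The returned dict can be merged into the row with row.update(read_required(driver)).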