Checking if a URL contains all elements in a list in Python
I'm working on a project where I want to produce a list of all the eBay item URLs that match a given set of keywords and a set price. So far, I've managed to make it work by first building a search URL in the format eBay requires from the user's input keywords and price, and then returning only the URLs from that page that include /itm/, as these are the item URLs.

However, I run into a problem when the keywords become too specific. When eBay turns up fewer than 10 results for a given search, it also provides links to 'related products' that match some but not all of your keywords. I don't want to return the links to these related products.

I've tried to take this into account by splitting the user's keywords into a list and adding an if statement that requires the URL to contain all the elements of this list, but that didn't work, and I get this error message: TypeError: 'in <string>' requires string as left operand, not bool.

See my code below. Any help would be appreciated!

    import requests
    from bs4 import BeautifulSoup
    import cherrypy

    user_keyword = input("What would you like to search for? ")
    print(user_keyword)
    keywords_url = user_keyword.replace(' ', '%20')
    user_price = input("What is your maximum price? ")
    url_part1 = 'http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw='
    url_part2 = '&_dcat=55793&rt=nc&_mPrRngCbx=1&_udlo&_udhi='
    url = (url_part1 + keywords_url + url_part2 + user_price)
    r = requests.get(url)
    data = r.text
    soup = BeautifulSoup(data, "html.parser")
    for link in soup.find_all('a'):
        if link.has_attr('href'):
            if '/itm/' in link['href']:  # Makes sure we only get actual item links
                if all(user_keyword.split(' ')) in link['href']:
                    print(link['href'])

The Python built-in function all is basically and-ing a list (or similar iterable) of logical values: it returns True only if every element is truthy. In your code, all(user_keyword.split(' ')) therefore evaluates to a single bool, and testing that bool with in against a string raises the TypeError. Thus, you must first test each word in user_keyword individually and then use all on the results:

if all(word in link['href'] for word in user_keyword.split(' ')):

This piece of code uses a generator expression (written like a list comprehension, but without the list markers []) to produce one boolean value per keyword. all returns True only if every one of those values is True, that is, only if the link contains all the user keywords.
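
As a minimal sketch of the same check pulled out into a helper, the snippet below filters a few links without touching the network. The function name contains_all_keywords, the sample hrefs, and the case-insensitive comparison are assumptions made for illustration; they are not part of the original code, and real eBay URLs may be shaped differently.

```python
def contains_all_keywords(url, keywords):
    """Return True if every whitespace-separated keyword occurs in url."""
    # Assumption: matching should ignore case, so both sides are lowercased.
    url_lower = url.lower()
    # split() with no argument also ignores repeated spaces.
    return all(word.lower() in url_lower for word in keywords.split())


# Hypothetical hrefs, invented for this example only.
links = [
    "http://www.ebay.com/itm/Red-Leather-Wallet/123",
    "http://www.ebay.com/itm/Blue-Leather-Wallet/456",
    "http://www.ebay.com/sch/i.html?_nkw=red+wallet",
]

# Keep only item links whose URL contains every keyword.
matches = [
    href for href in links
    if "/itm/" in href and contains_all_keywords(href, "red wallet")
]
print(matches)
```

Note that in performs a plain substring test, so a keyword such as "red" would also match a URL containing "credit". If that matters, splitting the URL into words before comparing would be one way to tighten the check.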