Checking if a URL contains all elements in a list in Python

I'm working on a project where I want to produce a list of all the eBay item URLs that match a given set of keywords and a maximum price. So far, I've managed to make it work by first building a URL in the format eBay requires from the user's input keywords and price, and then returning only the URLs from that page that include /itm/, as these will be the item URLs. However, I run into a problem when the keywords become too specific. When eBay turns up fewer than 10 results for a given search, it also provides some links to 'related products' that match some but not all of your keywords. I don't want to return the links to these related products. I've tried to take this into account by splitting the user's keywords into a list and then adding an if statement that requires the URL to contain all the elements in this list, but that didn't work, and I get this error message: TypeError: 'in <string>' requires string as left operand, not bool.

See my code below. Any help would be appreciated!

import requests
from bs4 import BeautifulSoup
import cherrypy

user_keyword = input("What would you like to search for? ")

print(user_keyword)


keywords_url = user_keyword.replace(' ', '%20')


user_price = input("What is your maximum price? ")

url_part1 = 'http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw='
url_part2 = '&_dcat=55793&rt=nc&_mPrRngCbx=1&_udlo&_udhi='

url = (url_part1 + keywords_url + url_part2 + user_price)


r = requests.get(url)

data = r.text

soup = BeautifulSoup(data, "html.parser")

for link in soup.find_all('a'):
    if link.has_attr('href'):
        if '/itm/' in link['href']:  # Makes sure we only get actual item links
            if all(user_keyword.split(' ')) in link['href']:
                print(link['href'])

The Python built-in function all essentially ANDs together a list (or any other iterable) of logical values. In your code, all(user_keyword.split(' ')) is evaluated first and returns a single bool, and testing whether that bool is in the URL string is what raises the TypeError. Thus, you must first test each word in user_keyword individually and then apply all to the results:

if all(word in link['href'] for word in user_keyword.split(' ')):

This piece of code uses a generator expression (like a list comprehension, but without the list markers [], so no intermediate list is built) to produce one boolean per keyword. all returns True only if every one of those values is True, that is, only if the link contains all the user keywords.
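To make the difference concrete, here is a minimal sketch that runs without any network access. The helper name contains_all and the two sample URLs are made up for illustration and are not taken from eBay or from the code in the question.

```python
def contains_all(url, keywords):
    """Return True if every whitespace-separated keyword occurs in url."""
    # One boolean per keyword; all() is True only if each one is True.
    return all(word in url for word in keywords.split())


# Hypothetical URLs, invented for this example only.
item_url = "http://www.ebay.com/itm/red-leather-wallet/123456"
related_url = "http://www.ebay.com/itm/blue-leather-wallet/654321"

print(contains_all(item_url, "red leather wallet"))     # True
print(contains_all(related_url, "red leather wallet"))  # False

# The expression from the question evaluates all(...) first, which gives
# a bool, and then asks whether that bool is "in" the string:
try:
    all("red leather wallet".split(" ")) in item_url
except TypeError as exc:
    print("TypeError:", exc)
```

Note that the substring test is case-sensitive, so if the keywords and the URL may differ in case, lowercasing both sides before comparing would be one way to handle that.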
