bs4 filtering with Python
I'm trying to write a script that checks the Steam store, and I'm having a problem filtering out all of the listings that don't have a discount in their markup. I want to keep only the listings that contain a span tag like
<span>-percentage</span>
and not the ones without it. Here's my code:
from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup  # needed for the bsoup(...) call below

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")
for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")
    if None in (d, n, p):
        continue
    print(d)
And the output (containing both the things I want to filter out and the things I want to keep):
<span>-16%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
<span>-19%</span>
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
</div>, <div class="col search_discount responsive_secondrow">
and so on. I've tried replacing
d = soup.find_all('div', class_="col search_discount responsive_secondrow")
with
d = soup.find_all('span', string="-16%")
to see if that would work, and it didn't. I want to keep the span tags but not the div tags. Could anyone help with this?
You can simply add a try-except block to the last for loop to solve your problem. Here is the full code:
from urllib.request import urlopen
from datetime import date
import requests as rq
from bs4 import BeautifulSoup as bsoup

inp = str(input('what would you like to search up?'))
w = ('https://store.steampowered.com/search/?term=' + inp)
page = rq.get(w)
soup = bsoup(page.content, 'html.parser')
soup.prettify()
sales = soup.find_all('div', class_="responsive_search_name_combined")
final = []
for sale in sales:
    p = soup.find('div', class_="col search_price responsive_secondrow")
    d = soup.find_all('div', class_="col search_discount responsive_secondrow")
    n = soup.find('span', class_="title")
    try:
        for element in d:
            # element.span is None when the discount div has no span inside it
            span = element.span
            if span:
                final.append(span.text)
    except:
        pass
print(final)
Output:
what would you like to search up?>? among us
['-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%', '-10%', '-25%']
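The two discounts repeat in the output above because soup.find_all is called on the whole page once for every entry in sales, so the same page-wide results are appended on each pass. One possible way around that is to search inside each row instead of the whole soup. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML and the helper name discounted_listings are made up for illustration, and the sketch assumes the title and discount elements are nested inside each responsive_search_name_combined div, which should be checked against the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the class names are copied from the question,
# but the nesting and the game titles are assumptions for illustration.
SAMPLE_HTML = """
<div class="responsive_search_name_combined">
  <span class="title">Game A</span>
  <div class="col search_discount responsive_secondrow">
    <span>-16%</span>
  </div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game B</span>
  <div class="col search_discount responsive_secondrow">
  </div>
</div>
<div class="responsive_search_name_combined">
  <span class="title">Game C</span>
  <div class="col search_discount responsive_secondrow">
    <span>-19%</span>
  </div>
</div>
"""


def discounted_listings(html):
    """Return (title, discount) pairs for rows whose discount div holds a span."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all("div", class_="responsive_search_name_combined"):
        # Search inside this row only, not the whole document.
        title = row.find("span", class_="title")
        discount_div = row.find("div", class_="search_discount")
        span = discount_div.span if discount_div else None
        if title is None or span is None:
            continue  # no discount span in this row, so skip it
        results.append((title.get_text(strip=True), span.get_text(strip=True)))
    return results


print(discounted_listings(SAMPLE_HTML))
```

With the real page, the same function could be given page.content from the requests call in the code above. Each listing then contributes at most one entry, and rows with an empty discount div are skipped.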