
Python - Beautiful Soup Select only returning []

I am currently learning from a Python tutorial on Udemy (total newbie to Python). I am at the Beautiful Soup section, where the exercise is to scrape the price of the author's book from Amazon. My code is below:

import bs4, requests
url = 'https://www.amazon.com/Automate-Boring-Stuff-Python-Programming/dp/1593275994/'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

response = requests.get(url, headers=headers)
response.raise_for_status()
soup = bs4.BeautifulSoup(response.text, 'html.parser')
soup.select('#addToCart > a > h5 > div > div.a-column.a-span4.a-text-right.a-span-last > span.a-size-medium.a-color-price.header-price')

When I inspect the price element, I can see this:

<span class="a-size-medium a-color-price header-price"> 


            $25.45



    </span>

However, when I copy its selector into soup.select and run the code, all I get back is [], i.e. an empty list. I should be getting the contents of the second code box.

UPDATE: While I was typing this question, it did return the correct result (the contents of the box with $25.45), but 5 minutes later it went back to returning only []. I am behind a proxy, and have also tried without going through the proxy, with no change in results. I don't get any error from response.raise_for_status() either. Can someone please assist?

(Note that I don't intend to screen scrape any commercial site; I would very much like to apply what I learn to in-house scenarios.)

Thank you!

You are over-complicating your CSS selector and making it fragile: it is heavily dependent on the page layout. You don't have to go through the complete parent-child chain to locate an element. Choose the most reliable, readable and appropriate points to base your locator on. For instance, in this case, the following works for me:

soup.select('#addToCart .header-price')
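To see why the shorter selector is more robust, here is a minimal offline sketch. The HTML fragment is made up for illustration (it is not Amazon's real markup), and `first_text`, `SAMPLE_HTML`, `LONG_SELECTOR` and `SHORT_SELECTOR` are hypothetical names. The fragment deliberately lacks the `h5` wrapper, so the full parent-child chain from the question stops matching while the short selector still finds the price:

```python
import bs4

# Hypothetical fragment loosely modelled on the question's markup.
# It is NOT the real Amazon page; the h5 wrapper is left out on purpose.
SAMPLE_HTML = """
<div id="addToCart">
  <a href="#">
    <div class="a-row">
      <span class="a-size-medium a-color-price header-price">
          $25.45
      </span>
    </div>
  </a>
</div>
"""

# The selector from the question: every ancestor must match exactly.
LONG_SELECTOR = ('#addToCart > a > h5 > div > '
                 'div.a-column.a-span4.a-text-right.a-span-last > '
                 'span.a-size-medium.a-color-price.header-price')

# The selector from the answer: any .header-price descendant of #addToCart.
SHORT_SELECTOR = '#addToCart .header-price'


def first_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    matches = soup.select(selector)
    if not matches:  # select() returns an empty list when there is no match
        return None
    return matches[0].get_text(strip=True)


soup = bs4.BeautifulSoup(SAMPLE_HTML, 'html.parser')
long_result = first_text(soup, LONG_SELECTOR)
short_result = first_text(soup, SHORT_SELECTOR)
print(long_result)
print(short_result)
```

Checking for an empty list before indexing also avoids an IndexError on the occasions when the page comes back without the expected element.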
