
Accepting cookies while scraping page with requests and beautifulsoup

I wrote a script that tracks the price of a product on many different pages. The problem is that some websites use cookies, and you have to click "accept cookies" before the price is shown.

This will probably not help, but this is the website; it's in Swedish, so many of you won't understand it.

How do I accept cookies while web scraping?

There are no cookies involved in making a bare request, so you shouldn't face any problem doing a GET or a POST request.

Edit: try this piece of code:

import requests

r = requests.get('https://www.google.com/')

with open('test.html', 'w') as f:
    f.write(r.text)  # the with-block closes the file automatically

Open the test.html file in your web browser and look for the difference. test.html is what your code sees, which is different from what a person sees in a normal web browser with the full GUI.
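If the saved HTML shows that the price really is gated behind a consent cookie, you can often send that cookie yourself with requests instead of clicking anything. The cookie name, value, and URL below are placeholders: copy the real ones from your browser's developer tools (Storage/Application tab) after accepting on the actual site.

```python
import requests

# Hypothetical consent cookie -- replace name/value with what the real
# site sets after you click "accept" in a normal browser.
consent_cookies = {"cookie_consent": "accepted"}

# Preparing (without sending) shows that the cookie is attached as a
# plain Cookie header on the outgoing request.
prepared = requests.Request(
    "GET", "https://example.com/product", cookies=consent_cookies
).prepare()
print(prepared.headers["Cookie"])  # cookie_consent=accepted

# In a real script you would just pass the same dict to requests.get:
# r = requests.get("https://example.com/product", cookies=consent_cookies)
```

Using a `requests.Session()` instead keeps any cookies the server sets across follow-up requests, which is closer to how a browser behaves.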

When you scrape a site you don't have to accept those cookies. But if you do want to accept them, you can simply click the "accept" button on the website. You can do this with this method:

Get the XPath by right-clicking the cookie button on the website and inspecting it.

