Python Requests response differs from Chrome response

Question

I need to download about 100 captcha images from a particular website. In summary, my code does the following:

1- download the page

2- search for the captcha image URL (using re) and download it

3- :( the downloaded image is different from what I see in the browser. I guess there is a parameter in the session or in the request (GET or POST) that I need to set but haven't.

import requests
import re
import time
s = requests.Session()
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

#download this page and look for the url of the captcha image
page = s.get('http://www.rrk.ir/News/ShowOldNews.aspx?Code=1', headers=headers)
result = re.search('img id="imgCaptcha" src="..(.*)"', page.content.decode('utf-8'))
img_url = 'http://www.rrk.ir' + result.group(1).split('"')[0]

print(img_url)
#download the image and save it to a file
img = s.get(img_url, headers=headers)
img_file_name =  './a'  + '.jpg'
with open(img_file_name, 'wb') as fout:
    fout.write(img.content)

s.close()
#:( the downloaded file is different from what I see in Chrome.

How can I find out what setting I'm missing?

Update 1: As suggested, I added custom headers, but it didn't help.

Answer 1

I'm currently struggling with a similar problem involving authorization. If you have a similar problem, turn off automatic redirects in Requests (allow_redirects=False) and check in the browser developer tools whether loading the page triggers a cascade of requests. The first request may cause redirects, and the intermediate responses may generate additional parameters for a new data/JSON payload, or set new headers or cookies.

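import requests

s = requests.Session()
# url_here and headers are placeholders for your own URL and header dict
resp = s.get(url_here, headers=headers, allow_redirects=False)
if resp.status_code == 302:
    # show what the redirect response sets before it is followed
    print('Redirect!', resp.headers, resp.cookies, sep="\n")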
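If there are several redirects in a row, the same idea can be repeated hop by hop. The following is only a rough sketch: `trace_redirects` and its parameters are hypothetical names introduced here, and it assumes every hop can be replayed as a plain GET, which may not hold for the site in question.

```python
import requests
from urllib.parse import urljoin


def trace_redirects(session, url, headers=None, max_hops=10):
    """Follow redirects one hop at a time and record what each response sets.

    Hypothetical helper for illustration; every hop is sent as a GET.
    """
    hops = []
    for _ in range(max_hops):
        resp = session.get(url, headers=headers, allow_redirects=False)
        hops.append({
            "url": url,
            "status": resp.status_code,
            "location": resp.headers.get("Location"),
            "set_cookie": resp.headers.get("Set-Cookie"),
        })
        if not resp.is_redirect:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, resp.headers["Location"])
    return hops
```

Calling `trace_redirects(s, url_here, headers=headers)` and printing each entry should make it easier to compare the chain with what the browser's Network tab shows for the same page.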

Check the cookies and response headers! You can also inspect the redirect chain (it is populated only when redirects are followed, which is the default for GET):

print(resp.history)
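As a small illustration of what `resp.history` contains, the sketch below flattens the intermediate responses and the final one into a list. `summarize_history` is a hypothetical helper name, not part of Requests. Note that with `allow_redirects=False` the history list is empty, because nothing was followed.

```python
import requests


def summarize_history(resp):
    """Return (status, url, Set-Cookie) for each redirect response and the final one.

    Hypothetical helper for illustration only.
    """
    chain = list(resp.history) + [resp]
    return [(r.status_code, r.url, r.headers.get("Set-Cookie")) for r in chain]
```

For example, `for row in summarize_history(s.get(url_here, headers=headers)): print(row)` lists every hop, so a cookie set by an intermediate response is easy to spot.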

Solution 1
1 2022-06-27 07:32:01
