Python requests module not passing parameters in a session

Question

I am attempting to bulk download a series of PDFs from a site that requires login authentication. I am able to log in successfully; however, when I send a GET request for '/transcripts/transcript.pdf?user_id=3007', the request returns the content for '/transcripts/transcript.pdf'.

Does anyone have any idea why the URL parameter is not being sent, or why the request would be rerouted?

I have tried passing the parameter 'user_id' as data, as params, and hardcoded in the URL.

I have removed the actual domain from the strings below for privacy.

import lxml.html
import requests

# username and password are assumed to be defined elsewhere.
with requests.Session() as s:
    login = s.get('<domain>/login/canvas')
    # Print the returned HTML to check that this really is the login page.
    print(login.text)
    login_html = lxml.html.fromstring(login.text)
    # Collect the hidden form fields so they are posted back with the credentials.
    hidden_inputs = login_html.xpath(r'//form//input[@type="hidden"]')
    form = {x.attrib["name"]: x.attrib["value"] for x in hidden_inputs}
    print("form: ", form)
    form['pseudonym_session[unique_id]'] = username
    form['pseudonym_session[password]'] = password
    response = s.post('<domain>/login/canvas', data=form)
    print(response.url, response.status_code)  # gets <domain>?login_success=1 200

    # An authorised request.
    data = {'user_id': '3007'}
    r = s.get('<domain>/transcripts/transcript.pdf?user_id=3007', data=data)
    print(r.url)  # gets <domain>/transcripts/transcript.pdf
    print(r.status_code)  # gets 200
    with open('test.pdf', 'wb') as f:
        f.write(r.content)

The GET response comes back for /transcripts/transcript.pdf and not /transcripts/transcript.pdf?user_id=3007.

Answer 1

From the looks of it, you are trying to use Canvas. I'm pretty sure that in Canvas you can bulk download all test attachments.

If that's not the case, there are a few things to try:

After logging in, try typing the URL with user_id into a browser. Does that take you directly to the PDF file, or to a link to one?
If so, look at the URL; it may simply not display the parameters. Some websites do this, so don't worry about it.

If not, GET may not be enough; perhaps the site uses JavaScript, etc.

Answer 2

After looking through the '.history' of the request, I found a series of 302 redirects.
The first was to '/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf'.
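For anyone who wants to repeat that check, here is a minimal sketch of walking the redirect chain. The helper name redirect_chain is hypothetical; it relies only on the .history, .status_code, .url and .headers attributes that a requests response exposes. The stand-in objects and their values are illustrative, modelled on the redirect described above, so that the sketch runs without the site.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return (status_code, url, Location header) for each hop, oldest first."""
    hops = []
    for hop in response.history:
        hops.append((hop.status_code, hop.url, hop.headers.get("Location")))
    # The final response is not part of .history, so add it at the end.
    hops.append((response.status_code, response.url, None))
    return hops


# Stand-ins for requests.Response objects (illustrative values only).
first = SimpleNamespace(
    status_code=302,
    url="<domain>/transcripts/transcript.pdf?user_id=3007",
    headers={"Location": "/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf"},
)
final = SimpleNamespace(
    status_code=200,
    url="<domain>/transcripts/transcript.pdf",
    headers={},
    history=[first],
)

for status, url, location in redirect_chain(final):
    print(status, url, "->", location)
```

With a real session, the same helper would be called as redirect_chain(r) on the response returned by s.get(...). The Location header of the first hop is where the query string can be seen to disappear.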
In a desperate attempt, I tried:
s.get('/login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf%3Fuser_id%3D3007') and this still rerouted me a few times but ultimately got me the file I wanted!
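Rather than percent-encoding that URL by hand, it can be built with the standard library. This is only a sketch: the helper name build_login_url is hypothetical, and it assumes the site keeps accepting the force_login and target_uri parameters seen in the redirect.

```python
from urllib.parse import urlencode


def build_login_url(base, target_path, target_params=None):
    """Build a /login URL whose target_uri carries the path and its query string."""
    target = target_path
    if target_params:
        target += "?" + urlencode(target_params)
    # urlencode percent-encodes '/', '?' and '=' inside the target_uri value.
    query = urlencode({"force_login": 0, "target_uri": target})
    return f"{base}/login?{query}"


url = build_login_url("", "/transcripts/transcript.pdf", {"user_id": 3007})
print(url)
# /login?force_login=0&target_uri=%2Ftranscripts%2Ftranscript.pdf%3Fuser_id%3D3007
```

Passing the real domain as base and calling s.get(url) inside the logged-in session would then mirror the request above, with only user_id changing for each PDF in the bulk download.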

If anyone has a more elegant solution to this, or any resources that I can read, I would greatly appreciate it!

2 answers

Answer 1
0 2019-01-25 19:46:16

Answer 2
0 Accepted 2019-01-25 20:26:44
