Python web scraping with requests: automatic login not working

Question

I have been trying to scrape a website with the Python requests module, and I need to log in to the site to retrieve the data I want. I have looked everywhere but cannot work out why it is not working. Here is my code so far:

import requests
import bs4 as bs

login_url = "__withheld__"
target_url = "__withheld__"

login_data = { "username": "my_username", "password": "my_password"}

with requests.Session() as s:
    page = s.get(login_url)
    page_login = s.post(login_url, data = login_data)
    page = s.get(target_url)
    final_page = bs.BeautifulSoup(page.content, 'lxml')
    print(final_page.title)

This is the HTML of the login fields (username, password and submit button):

<input name="username" type="text" id="username" class="metro-input" placeholder="Username" value="">
<span id="username-error" class=""></span>
<label class="ie789Only"> Password</label>
<input name="password" type="password" id="password" class="metro-input" placeholder="Password">
<input type="submit" name="button1" value="Sign in" id="button1" class="metro-button">

I believe it may have to do with the website requiring the user to click the button, though I could find no solution. I also looked for POST requests in the developer console while logging in myself, but found no form data that clearly contained the username and password. Any help is appreciated.

Update: here is a link to a site run by the same company (my own is withheld for privacy) with the same security features, in case it helps: https://ashwood-vic.compass.education/login.aspx?sessionstate=disabled

Answer 1

Could you try the code below?

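import requests
import bs4 as bs

login_url = "__withheld__"  # same login URL as in the question
username = 'username of the site'
password = 'password of the site'

# Send the credentials using HTTP Basic authentication
req = requests.get(login_url, auth=(username, password))
final_page = bs.BeautifulSoup(req.content, 'lxml')
print(final_page.title)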

Please refer to the Requests documentation on basic authentication: http://docs.python-requests.org/en/master/user/authentication/#basic-authentication
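For comparison, `auth=(username, password)` sends the credentials in an HTTP `Authorization` header, whereas the HTML in the question is an ordinary form with a named submit button. If the site expects a form post instead, the following is a minimal sketch rather than a confirmed fix. It assumes the server wants every `<input>` on the login page sent back, including any hidden fields the page may carry and the `button1` submit value. The names `FormFieldCollector`, `build_login_payload` and `login` are hypothetical helpers, and the parsing uses the standard library's `html.parser` so the sketch does not depend on lxml.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every named <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Inputs without a value attribute are sent as empty strings.
            self.fields[name] = attr.get("value") or ""


def build_login_payload(login_html, username, password):
    """Return POST data holding every input found on the login page.

    Assumption: the server expects hidden fields and the submit button
    ("button1" in the question's HTML) alongside the credentials.
    """
    collector = FormFieldCollector()
    collector.feed(login_html)
    payload = dict(collector.fields)
    # Field names "username" and "password" come from the question's HTML.
    payload["username"] = username
    payload["password"] = password
    return payload


def login(session, login_url, username, password):
    """Fetch the login page, then post the completed form back to it.

    `session` is expected to be a requests.Session so cookies persist
    between the GET and the POST.
    """
    page = session.get(login_url)
    payload = build_login_payload(page.text, username, password)
    return session.post(login_url, data=payload)
```

With the question's variables, this would be called as `login(s, login_url, "my_username", "my_password")` inside the `with requests.Session() as s:` block, followed by `s.get(target_url)`. If the page builds or submits the form with JavaScript, posting the fields this way would not be enough.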

Answer 1, posted 2017-11-04 09:27:09
