Python web scraping with login

I'm trying to log in to a password-protected site in order to access a protected page. I have the names of the email and password fields, along with the csrf-token. But when I try to access the protected page, the site doesn't let me in and redirects me back to the login page. Any help would be awesome! The site I'm trying to access is:

https://www.usertesting.com/users/sign_in

import requests
from lxml import html

session_requests = requests.session()

login_url = "https://www.usertesting.com/users/sign_in"
result = session_requests.get(login_url)

tree = html.fromstring(result.text)
authenticity_token = list(set(tree.xpath("//meta[@name='csrf-token']/@content")))[0]

userInfo = {
    "user[email]": "email", 
    "user[password]": "password", 
    "csrf-token": authenticity_token
}

result = session_requests.post(
    login_url, 
    data = userInfo, 
    headers = dict(referer=login_url)
)

url = 'https://www.usertesting.com/my_dashboard'

result = session_requests.get(
    url, 
    headers = dict(referer = url)
)

print(result.content)

Take a look at https://kazuar.github.io/scraping-tutorial/ for the answer you're looking for. In summary, you need to inspect the web page, and before you begin your full scraping program you should write a separate function that submits the username and password and enters the site. Once that login step completes, begin the full scraping script.
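The separate login function described above might look something like the sketch below. It rests on assumptions that are not confirmed for this site: that the login page is a standard Rails form, which by default expects the token under the field name given by the csrf-param meta tag (usually authenticity_token) rather than under "csrf-token", and that a failed login ends up back on the login URL. The names CsrfMetaParser, build_login_payload, and login are made up for illustration. The session argument is expected to behave like requests.session() from the question.

```python
from html.parser import HTMLParser


class CsrfMetaParser(HTMLParser):
    """Collects the content of the csrf-param and csrf-token <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("name") in ("csrf-param", "csrf-token"):
            self.meta[attrs["name"]] = attrs.get("content")


def build_login_payload(login_page_html, email, password):
    """Build the form data for the login POST from the login page's HTML."""
    parser = CsrfMetaParser()
    parser.feed(login_page_html)
    token = parser.meta.get("csrf-token")
    if token is None:
        raise ValueError("no csrf-token meta tag found on the login page")
    # Assumption: fall back to the Rails default field name when the page
    # does not declare one via <meta name="csrf-param">.
    param = parser.meta.get("csrf-param") or "authenticity_token"
    return {
        "user[email]": email,
        "user[password]": password,
        param: token,
    }


def login(session, login_url, email, password):
    """Fetch the login page, post the credentials, and report the outcome.

    `session` is any object with requests-style get/post methods, so the
    cookies set during login are reused by later requests on that session.
    """
    page = session.get(login_url)
    payload = build_login_payload(page.text, email, password)
    response = session.post(
        login_url, data=payload, headers={"Referer": login_url}
    )
    # Assumption: a rejected login leaves us on the login URL, while a
    # successful one redirects somewhere else.
    logged_in = response.url.rstrip("/") != login_url.rstrip("/")
    return logged_in, response
```

With the question's code, this would be called as login(session_requests, login_url, "email", "password"), and the dashboard would only be requested when the first returned value is True. If it is False, comparing the payload against the form fields that the browser's developer tools show for a real login is a reasonable next step.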
