Logging into website and scraping data
The website I am trying to log in to is https://realitysportsonline.com/RSOLanding.aspx. I can't seem to get the login to work, since the process is a little different from a typical site that has a dedicated login page. I don't get any errors, but the login action doesn't work, which then causes the request for main to redirect to the homepage.
import requests
url = "https://realitysportsonline.com/RSOLanding.aspx"
main = "https://realitysportsonline.com/SetLineup_Contracts.aspx?leagueId=3000&viewingTeam=1"
data = {"username": "", "password": "", "vc_btn3 vc_btn3-size-md vc_btn3-shape-rounded vc_btn3-style-3d vc_btn3-color-danger" : "Log In"}
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
          'Referer': 'https://realitysportsonline.com/RSOLanding.aspx',
          'Host': 'realitysportsonline.com',
          'Connection': 'keep-alive',
          'Accept-Language': 'en-US,en;q=0.5',
          'Accept-Encoding': 'gzip, deflate, br',
          'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'}
s = requests.session()
s.get(url)
r = s.post(url, data, headers=header)
page = requests.get(main)
First of all, you create a session and, assuming your POST request worked, you then request an authorised page without using your previously created session. A plain requests.get call starts from scratch and does not carry the cookies that the session collected.
You need to make the request with the s object you created, like so: page = s.get(main)
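To illustrate the difference, here is a minimal offline sketch (it only prepares requests and never sends them). The example.com URL and the auth cookie are made up for illustration; the real site sets its own cookies when the login succeeds.

```python
import requests

# Hypothetical protected URL, for illustration only
PROTECTED_URL = "https://example.com/private"

s = requests.Session()
# Stand-in for the cookie a successful login response would store in the session
s.cookies.set("auth", "token123", domain="example.com", path="/")

# Prepared through the session: the stored cookie is attached
with_session = s.prepare_request(requests.Request("GET", PROTECTED_URL))

# Prepared outside the session: nothing is attached
without_session = requests.Request("GET", PROTECTED_URL).prepare()

print("via session:   ", with_session.headers.get("Cookie"))
print("without session:", without_session.headers.get("Cookie"))
```

The second request has no Cookie header at all, which is what the server sees when you call requests.get(main) instead of s.get(main).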
However, there were also a few issues with your POST request. You were making a request to the home page instead of the /Login route. You were also missing the Content-Type header.
import requests

# Login endpoint (a JSON service), not the landing page
url = "https://realitysportsonline.com/Services/AccountService.svc/Login"
main = "https://realitysportsonline.com/LeagueSetup.aspx?create=true"
payload = {"username": "", "password": ""}  # fill in your credentials
headers = {
    'Content-Type': "text/json",
    'Cache-Control': "no-cache"
}
s = requests.session()
# Log in and fetch the page through the same session so the cookies carry over
response = s.post(url, json=payload, headers=headers)
page = s.get(main)
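If it helps to see what changes between passing the dict positionally (as in the question) and passing it with json=, the following offline sketch prepares both requests without sending them. The example.com URL is a placeholder, and the sketch shows only what requests does by default; it does not confirm what the real login service accepts.

```python
import json
import requests

# Placeholder URL, for illustration only; nothing is sent over the network
LOGIN_URL = "https://example.com/login"
payload = {"username": "", "password": ""}

# data=... form-encodes the dict, which is what s.post(url, data) does
as_form = requests.Request("POST", LOGIN_URL, data=payload).prepare()

# json=... serialises the dict to JSON and sets a JSON Content-Type by default
as_json = requests.Request("POST", LOGIN_URL, json=payload).prepare()

print(as_form.headers["Content-Type"], as_form.body)
print(as_json.headers["Content-Type"], as_json.body)
```

Note that an explicit Content-Type in headers, such as the "text/json" used above, takes precedence over the default that json= would set.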
PS: the main URL from your question redirects to the homepage even with a valid session (at least for me).