
Can't log in with Python requests, even after making a GET request first and setting headers

I am trying to get data from a page behind a login. I've read posts from other people who had the same problem and tried what they suggest: making a GET request first to pick up cookies, and setting headers. None of it works. When I examine the output of print(soup.title.get_text()), I still get "Log In" as the title. The keys in login_data match the name attributes of the HTML <input> elements, e.g. <input name=ctl00$cphMain$logIn$UserName ...> for the username and <input name=ctl00$cphMain$logIn$Password ...> for the password. I'm not sure what to do next. I can't use Selenium, because I have to run this script on an EC2 instance that is running a Splunk server.

import requests
from bs4 import BeautifulSoup

link = "****"
login_URL = "https://erecruit.elwoodstaffing.com/Login.aspx"
login_data = {
    "ctl00$cphMain$logIn$UserName": "****",
    "ctl00$cphMain$logIn$Password": "****",
}



with requests.Session() as session:
    z = session.get(login_URL)
    session.headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.63 Safari/537.36',
        'Content-Type': 'application/json;charset=UTF-8',
    }
    post = session.post(login_URL, data=login_data)
    response = session.get(link) 
    html = response.text
    soup = BeautifulSoup(html, "html.parser")
    print(soup.title.get_text())

I actually found the answer.

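Open the Network tab in Chrome's developer tools, log in through the browser, and copy the login request as a cURL command (right-click the request, then Copy > Copy as cURL). Then use a website or tool to convert the cURL command to its equivalent in your programming language (Python, Node, Java, and so forth). The converted code reproduces the headers, cookies, and form fields that the browser actually sent, including any the hand-written script left out.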
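For anyone who wants to see roughly what the copied request tends to contain, here is a minimal sketch. It assumes (I have not confirmed this for this site) that the login page is an ASP.NET WebForms page whose form carries hidden inputs such as __VIEWSTATE and __EVENTVALIDATION that must be posted back along with the credentials. The names HiddenFieldParser, build_login_payload and log_in are made up for illustration; only the two credential field names come from the question.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. a bare `disabled`), hence the `or ""`.
        if (attr.get("type") or "").lower() != "hidden":
            return
        name = attr.get("name")
        if name:
            self.fields[name] = attr.get("value") or ""


def build_login_payload(login_page_html, username, password):
    """Merge the page's hidden form fields with the credentials."""
    parser = HiddenFieldParser()
    parser.feed(login_page_html)
    payload = dict(parser.fields)
    # Field names taken from the question's login_data.
    payload["ctl00$cphMain$logIn$UserName"] = username
    payload["ctl00$cphMain$logIn$Password"] = password
    return payload


def log_in(session, login_url, username, password):
    """GET the login page, then POST the completed form back to it.

    `session` is expected to be a requests.Session, so cookies set by the
    GET are sent again with the POST.
    """
    page = session.get(login_url)
    payload = build_login_payload(page.text, username, password)
    # data= makes requests send application/x-www-form-urlencoded on its own,
    # so no Content-Type header is set by hand here.
    return session.post(login_url, data=payload)
```

With the question's variables this would be called as log_in(session, login_URL, "****", "****") inside the with requests.Session() block, before session.get(link). If the request copied from the Network tab also shows a submit-button field or other values, add those to the payload as well; the sketch only picks up hidden inputs.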
