Why doesn't my program stay logged into a website after connecting?
I am trying to scrape information about job applicants, but after s.post(login_url, data=payload) the session gets reset and the program no longer has access to the website content. Logging in on its own works fine, but when I try to access interviews_url = ('https://www.sparkhire.com/company/interviews') it logs me out. Am I doing something wrong?
from bs4 import BeautifulSoup
import requests
import token_scraper

login_url = 'https://www.sparkhire.com/login'
interviews_url = 'https://www.sparkhire.com/company/interviews'

payload = {
    '_token': token_scraper.token,
    'email': 'censored',
    'password': 'censored'
}

with requests.session() as s:
    s.post(login_url, data=payload)
    r = s.get(interviews_url)
    # Parse the response body; the session object itself has no content.
    soup = BeautifulSoup(r.content, 'html.parser')
    print(soup)
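The `_token` field looks like a CSRF token, and such tokens are normally tied to the session cookie issued together with the login form. If `token_scraper` fetches the login page over its own connection (an assumption, since its code is not shown), the token would not match the cookies held by `s`, and the server may discard the login. Below is a minimal sketch of reading the token out of the HTML that the same session received, using only the standard library. `extract_token` and the sample markup are hypothetical and only illustrate the idea.

```python
from html.parser import HTMLParser


class _TokenParser(HTMLParser):
    """Collects the value of the first <input name="_token"> element."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything except the first matching input field.
        if tag != "input" or self.token is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == "_token":
            self.token = attr_map.get("value")


def extract_token(html):
    """Return the hidden _token value from a login form, or None if absent."""
    parser = _TokenParser()
    parser.feed(html)
    return parser.token


# Hypothetical login form markup, for illustration only.
sample = (
    '<form><input type="hidden" name="_token" value="abc123">'
    '<input name="email"></form>'
)
print(extract_token(sample))
```

In a real run, the HTML would come from `s.get(login_url).text` on the same session that later posts the credentials.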
from bs4 import BeautifulSoup
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0'
}


def get_soup(content):
    return BeautifulSoup(content, 'lxml')


def main(url):
    with requests.Session() as req:
        req.headers.update(headers)
        # Load the login page in this session so the token matches its cookies.
        r = req.get(url)
        soup = get_soup(r.text)
        data = {
            "_token": soup.select_one('input[name=_token]')['value'],
            "email": "any@any.com",
            "password": "yourpass"
        }
        # Log in, then request the protected page with the same session.
        req.post(url, data=data)
        r = req.get('https://www.sparkhire.com/company/interviews')
        with open('view.html', 'wb') as f:
            f.write(r.content)


main('https://www.sparkhire.com/login')
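To tell whether the login actually held, one option is to look at where the final request ended up. The sketch below assumes the site redirects unauthenticated requests back to the login page, which the thread does not confirm. `landed_on_login` is a hypothetical helper, not part of requests.

```python
from urllib.parse import urlsplit


def landed_on_login(final_url, login_url="https://www.sparkhire.com/login"):
    """Return True if final_url has the same host and path as login_url."""
    final, login = urlsplit(final_url), urlsplit(login_url)
    return (
        final.netloc.lower() == login.netloc.lower()
        and final.path.rstrip("/") == login.path.rstrip("/")
    )


# Assumed behaviour: a rejected session is bounced back to /login.
print(landed_on_login("https://www.sparkhire.com/login"))
print(landed_on_login("https://www.sparkhire.com/company/interviews"))
```

With requests, the URL reached after redirects is available as `r.url`, so passing `r.url` from the interviews request to this helper would flag a session that was dropped.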