
Python Requests - SAML Login Redirect

I'm trying to log in to a website from this URL: https://pollev.com/login. Since I'm using a school email, the site redirects to the school's login portal and uses that portal to authenticate the login. The redirect shows up when you type in a uw.edu email (example: myname@uw.edu). After logging in, UW sends a POST request callback to https://www.polleverywhere.com/auth/washington/callback with a SAMLResponse header. I think I need to simulate the GET request from pollev's login page and then send the login headers to the UW login page, but what I'm doing right now isn't working.

Here's my code:

import requests

with requests.Session() as s:
    header_data = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
        'referer': 'https://pollev.com/login'
    }
    login_data = {
        'j_username': 'username',
        'j_password': 'password',
        '_eventId_proceed': 'Sign in'
    }

    r = s.get('https://idp.u.washington.edu/idp/profile/SAML2/Redirect/SSO?execution=e2s1',
              headers=header_data, data=login_data)
    print(r.text)

Right now, r.text shows a NoSuchFlowExecutionException HTML page. What am I missing? Logging into the website normally requires a login, password, Referer, and X-CSRF token, which I was able to do, but I don't know how to navigate a redirect for authentication.

Old question, but I had nearly identical needs and carried on until I solved it. In my case, which may still be the case of the OP, I have the required credentials. I am certain this could be made more efficient / pythonic and would greatly appreciate any tips / corrections.

import re
import requests

# start HTTP request session
s = requests.Session()

# Prepare for first request - This is the ultimate target URL
url1 = '/URL/needing/shibbolethSAML/authentication'
header_data = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'}

# Make first request
r1 = s.get(url1, headers = header_data)

# Prepare for second request - extract URL action for next POST from response, append header, and add login credentials
ss1 = re.search('action="', r1.text)
ss2 = re.search('" autocomplete', r1.text)
url2 = 'https://idp.u.washington.edu' + r1.text[ss1.span(0)[1]:ss2.span(0)[0]]
header_data.update({'Accept-Encoding': 'gzip, deflate, br', 'Content-Type': 'application/x-www-form-urlencoded'})
cred = {'j_username': 'username', 'j_password':'password', '_eventId_proceed' : 'Sign in'}

# Make second request
r2 = s.post(url2, data = cred)

# Prepare for third request - format and extract URL, RelayState, and SAMLResponse
ss3 = re.search('<form action="',r2.text) # expect only one instance of this pattern in string
ss4 = re.search('" method="post">',r2.text) # expect only one instance of this pattern in string
url3 = r2.text[ss3.span(0)[1]:ss4.span(0)[0]].replace('&#x3a;',':').replace('&#x2f;','/')

ss5 = re.search('name="RelayState" value="', r2.text) # expect only one instance of this pattern in string
ss6 = re.search('"/>', r2.text) # first '"/>' in the page is expected to close the RelayState input
relaystate_value = r2.text[ss5.span(0)[1]:ss6.span(0)[0]].replace('&#x3a;',':')

ss7 = re.search('name="SAMLResponse" value="', r2.text)
ss8 = [m.span(0) for m in re.finditer('"/>',r2.text)] # expect multiple matches with the second match being desired
saml_value = r2.text[ss7.span(0)[1]:ss8[1][0]]

data = {'RelayState': relaystate_value, 'SAMLResponse': [saml_value, 'Continue']}
header_data.update({'Host': 'training.ehs.washington.edu', 'Referer': 'https://idp.u.washington.edu/', 'Connection': 'keep-alive'})

# Make third request
r3 = s.post(url3, headers=header_data, data = data)

# You should now be at the intended URL
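
One possible tidy-up of the extraction steps above is to parse the returned form with the standard library's html.parser instead of slicing around regex matches. The sketch below is only an illustration under assumptions, not the method used above: FormExtractor, extract_form, the sp.example.org URL and the sample HTML are all made up for the example, and a real IdP page may be structured differently. HTMLParser decodes character references such as &#x3a; in attribute values, so the manual .replace() calls are not needed.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect the action URL and the named <input> values of the first <form>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        attrs = dict(attrs)
        if tag == "form" and not self._in_form:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Inputs without a name (e.g. the submit button) are not submitted.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def extract_form(html_text):
    """Return (action, fields) for the first form found in html_text."""
    parser = FormExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.action, parser.fields


# Hypothetical response body resembling an auto-submit SAML form.
SAMPLE = '''<form action="https&#x3a;&#x2f;&#x2f;sp.example.org&#x2f;Shibboleth.sso&#x2f;SAML2&#x2f;POST" method="post">
<input type="hidden" name="RelayState" value="ss&#x3a;mem&#x3a;abc123"/>
<input type="hidden" name="SAMLResponse" value="PHNhbWxwOlJlc3BvbnNlPg=="/>
<input type="submit" value="Continue"/>
</form>'''

action, fields = extract_form(SAMPLE)
print(action)
print(fields)
```

With requests, something like `action, fields = extract_form(r2.text)` followed by `s.post(action, data=fields)` could stand in for the slicing in the third step. The same helper could be applied to r1.text, although the login form's relative action would still need to be resolved against the IdP host, for example with urllib.parse.urljoin(r1.url, action).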

You're not going to be successful faking out SAML2 SSO. The identity provider (IdP) at UW is looking to support an authentication request from the service provider (SP) polleverywhere.com. Part of that is verifying the request actually originated from polleverywhere. This could be as simple as requiring an SSL connection from polleverywhere, or as complicated as requiring an encrypted and signed authentication request. Since you don't have those credentials, the resulting response isn't going to be readable. SPs are registered with IdPs.

Now, there may be a different way to sign into polleverywhere: a different URL which will not trigger an SSO request, but that might be network restricted or require other difficult authentication.
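
To see what an SP's authentication request looks like, the SAMLRequest query parameter on the redirect to the IdP can be decoded. In the SAML HTTP-Redirect binding the XML is raw-DEFLATE compressed, base64-encoded, and then URL-encoded. The following is a minimal sketch assuming that binding; the function names, the idp.example.edu host and the sample XML are invented for illustration and are not taken from polleverywhere or UW.

```python
import base64
import zlib
from urllib.parse import urlparse, parse_qs, urlencode


def encode_saml_request(xml_text):
    """Raw-DEFLATE then base64, the encoding the HTTP-Redirect binding applies."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits: no zlib header
    deflated = compressor.compress(xml_text.encode("utf-8")) + compressor.flush()
    return base64.b64encode(deflated).decode("ascii")


def decode_saml_request(redirect_url):
    """Recover the AuthnRequest XML from the SAMLRequest query parameter."""
    query = parse_qs(urlparse(redirect_url).query)
    raw = base64.b64decode(query["SAMLRequest"][0])
    return zlib.decompress(raw, wbits=-15).decode("utf-8")


# Hypothetical request and IdP endpoint, used only to demonstrate the round trip.
sample_xml = ('<samlp:AuthnRequest ID="_example" '
              'Destination="https://idp.example.edu/idp/profile/SAML2/Redirect/SSO"/>')
sample_url = ("https://idp.example.edu/idp/profile/SAML2/Redirect/SSO?"
              + urlencode({"SAMLRequest": encode_saml_request(sample_xml),
                           "RelayState": "opaque"}))

print(decode_saml_request(sample_url))
```

In a requests session, the redirect URLs to try this on would be found in r.history after a GET to the SP's login URL. Whether the decoded request carries a signature depends on how the SP and IdP are configured.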
