

Retrieving a page that redirects to a login page within Python

I am having a rough time gathering data from a website programmatically. I am attempting to use this example to log into the server, but it is not working, and I think this is the wrong type of login.

The site I am trying to access redirects to a login page when I attempt to download the data to parse the HTML.

This is the URL:

https://mtred.com/rewards.html

and here's the code:

import urllib
import urllib2

# build opener with HTTPCookieProcessor
o = urllib2.build_opener( urllib2.HTTPCookieProcessor() )
urllib2.install_opener( o )
# assuming the site expects 'user' and 'pass' as query params
p = urllib.urlencode( { 'UserLogin_username': 'mylogin', 'UserLogin_password': 'mypass' } )
# perform login with params
f = o.open( 'http://www.mtred.com/user/login.html',  p )
data = f.read()
f.close()
# second request should automatically pass back any
# cookies received during login... thanks to the HTTPCookieProcessor
f = o.open( 'https://www.mtred.com/rewards.html',p )
data = f.read()
print data

It kicks me to the login page again when I attempt to open the rewards page. I am trying to parse the rewards to do some statistics automatically, since this information isn't available via a public API.

One issue that pops out is that you're passing in the id values of the form parameters for the login, not the name values. E.g., in the user name form field, you are specifying UserLogin_username, but the name of that field as expected by the server is "UserLogin[username]":

<label for="UserLogin_username" class="required">
username or email <span class="required">*</span></label>       
<input name="UserLogin[username]" id="UserLogin_username" type="text" />    </div>

<div class="row">
<label for="UserLogin_password" class="required">password <span class="required">*</span></label>   
<input name="UserLogin[password]" id="UserLogin_password" type="password" /> </div>

Since the server isn't getting back parameters that it knows about, the behavior you're seeing is not unexpected. (Not saying that there aren't other problems here; I haven't looked.)
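A minimal sketch of that fix, assuming nothing else is wrong with the login: the POST body is keyed by the form's name attributes rather than its id attributes. It is written for Python 3 (urllib.request instead of urllib2); the function names are made up for illustration, and the URLs are the ones from the question. Whether the site accepts this login without further fields is not something this sketch can confirm.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# URLs copied from the question.
LOGIN_URL = "http://www.mtred.com/user/login.html"
REWARDS_URL = "https://www.mtred.com/rewards.html"


def build_login_body(username, password):
    """Encode the credentials under the form's name attributes (not the ids)."""
    fields = {
        "UserLogin[username]": username,
        "UserLogin[password]": password,
    }
    # urlencode percent-encodes the square brackets; a POST body must be bytes.
    return urllib.parse.urlencode(fields).encode("ascii")


def fetch_rewards(username, password):
    """Log in, then fetch the rewards page with the cookies set by the login."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Passing a body makes this a POST; the cookie jar stores the session cookie.
    opener.open(LOGIN_URL, build_login_body(username, password)).close()
    # No body here, so this is a plain GET that sends the stored cookies back.
    with opener.open(REWARDS_URL) as response:
        return response.read()
```

Calling fetch_rewards('mylogin', 'mypass') would perform both requests. Note that the second request is a GET here, whereas the code in the question re-sends the login parameters to the rewards page.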

You must include in your POST data the value named "YII_CSRF_TOKEN" that is included in the HTML form, or use the "ClientForm" library.
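One way to pick up that token by hand is sketched below in Python 3, using only the standard library. It assumes the token is delivered as a hidden input named YII_CSRF_TOKEN in the login page's HTML, which the answer implies but the quoted markup does not show; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.hidden[attr["name"]] = attr.get("value") or ""


def extract_csrf_token(html, field="YII_CSRF_TOKEN"):
    """Return the value of the hidden field, or None if it is not present."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.hidden.get(field)
```

The idea would be to GET the login page first through the same cookie-aware opener, pass the decoded HTML to extract_csrf_token, and add the result to the login POST data under the key "YII_CSRF_TOKEN" next to the username and password fields.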
