Here is the form. The same exact form appears twice in the source.
<form method="POST" action="/login/?session=sess">
<input type="text" id="usern" name="username" value="" placeholder="Username"/>
<input type="password" id="passw" name="password" placeholder="Password"/>
<input type="hidden" name="ses_token" value="token"/>
<input id="login" type="submit" name="login" value="Log In"/>
</form>
I am getting the "action" attribute with this Python code:
import lxml.html
tree = lxml.html.fromstring(pagesource)
print tree.xpath('//@action')
raw_input()
Since there are two forms, it prints the attribute from both of them:
['/login/?session=sess', '/login/?session=sess']
How can I get it to print just one? I only need one, since they're the same exact form.
I also have a second question: how can I get the value of the token? I am talking about this line:
<input type="hidden" name="ses_token" value="token"/>
I tried similar code:
import lxml.html
tree = lxml.html.fromstring(pagesource)
print tree.xpath('//@value')
raw_input()
However, since there is more than one attribute named value, it prints out:
['', 'token', 'Log In', '', 'token', 'Log In'] # or something close to that
How can I get just the token? And just one?
Is there a better way to do this?
Use find() instead of xpath(), since find() returns only the first match.
Here's an example based on the code you've provided:
import lxml.html
pagesource = """<form method="POST" action="/login/?session=sess">
<input type="text" id="usern" name="username" value="" placeholder="Username"/>
<input type="password" id="passw" name="password" placeholder="Password"/>
<input type="hidden" name="ses_token" value="token"/>
<input id="login" type="submit" name="login" value="Log In"/>
</form>
<form method="POST" action="/login/?session=sess">
<input type="text" id="usern" name="username" value="" placeholder="Username"/>
<input type="password" id="passw" name="password" placeholder="Password"/>
<input type="hidden" name="ses_token" value="token"/>
<input id="login" type="submit" name="login" value="Log In"/>
</form>
"""
tree = lxml.html.fromstring(pagesource)
form = tree.find('.//form')
# The parenthesized form with % formatting behaves the same on Python 2 and Python 3
print("Action: %s" % form.action)
print("Token: %s" % form.find('.//input[@name="ses_token"]').value)
Prints:
Action: /login/?session=sess
Token: token
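If you would rather stay with xpath(), here is a minimal sketch of the same idea. It assumes the page looks like the sample in the question; the FORM string and the variable names below are made up for illustration. xpath() always returns a list of every match, so you narrow the expression to the attribute you want and take the first item yourself:

```python
import lxml.html

# Hypothetical sample page: the same login form repeated twice, as in the question.
FORM = (
    '<form method="POST" action="/login/?session=sess">'
    '<input type="text" name="username" value=""/>'
    '<input type="hidden" name="ses_token" value="token"/>'
    '<input type="submit" name="login" value="Log In"/>'
    '</form>'
)
pagesource = "<html><body>" + FORM + FORM + "</body></html>"

tree = lxml.html.fromstring(pagesource)

# '//form/@action' selects the action attribute of every form (two matches here).
actions = tree.xpath('//form/@action')

# The [@name="ses_token"] predicate restricts the match to the hidden token input,
# so the other value attributes are not returned.
tokens = tree.xpath('//input[@name="ses_token"]/@value')

# Take the first match, guarding against an empty list if nothing matched.
action = actions[0] if actions else None
token = tokens[0] if tokens else None

print("Action: %s" % action)
print("Token: %s" % token)
```

The guard on the empty list matters because indexing [0] directly raises an IndexError when the page has no matching form or input.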
Hope that helps.