Python 2.7, requests: logging in to the onlydomains.com site
For a few days now I have been trying to log in to www.onlydomains.com from a script so that I can retrieve my list of domains. This is what I have so far:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests, sys, re, whois
from bs4 import BeautifulSoup

def onlydomains():
    with requests.Session() as c:
        PASSWORD = 'my%password'
        USERNAME = 'my_username'
        URL = 'https://www.onlydomains.com/account/login'
        c.get(URL)
        soup = BeautifulSoup(c.get(URL).text, "lxml")
        csrf = soup.find("input", value=True)["value"]
        login_data = {
            'csrfToken' : csrf,
            'username' : USERNAME,
            'password' : PASSWORD,
            'submit' : 'Submit',}
        r = c.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})
        r = c.get('https://onlydomains.secure-admin.com/domain/index')
        print r.text

onlydomains()
It does not work for me, because I always get the login page back:
> ./onlydomains.py
<!DOCTYPE html><html lang="en"><head><meta charset="utf-8" /><title>Login / Sign Up - OnlyDomains</title>
Any ideas?
If you look at what the POST returns, you can see that it contains a window.location = some_url assignment:
<script type="text/javascript">
$(document).ready(function(){
setTimeout(function(){
window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=v42oadi4cAuxIM4PHc5IdgU%5CdXd3AjswsOraTLjynso%3D';;
},1000);
});
</script>
You can use that URL to reach the page:
import re
import requests
from bs4 import BeautifulSoup

# Capture the URL that the returned page assigns to window.location.
patt = re.compile(r"window.location\s+=\s+'(http.*)'")
with requests.Session() as s:
    PASSWORD = 'pass'
    USERNAME = 'user'
    URL = 'https://www.onlydomains.com/account/login'
    soup = BeautifulSoup(s.get(URL).text, "lxml")
    # Select the CSRF input by name instead of taking the first input with a value.
    csrf = soup.select_one("input[name=csrfToken]")["value"]
    login_data = {
        'csrfToken' : csrf,
        'username' : USERNAME,
        'password' : PASSWORD}
    # Post with the same session so the cookies from the GET are sent along.
    r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})
    url = patt.search(r.text).group(1)
    r = s.get(url).text
    print(r)
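The extraction step can be tried without touching the network. What follows is only a minimal sketch, assuming the response body looks like the script shown above. The helper name extract_redirect is hypothetical, and the sample URL carries a placeholder instead of a real _srs_ token. Unlike patt.search(...).group(1), it returns None rather than raising when no redirect is present, which is roughly what you would expect to see if the login had been rejected.

```python
import re

# Same idea as `patt` above, but the capture stops at the closing quote.
REDIRECT_RE = re.compile(r"window\.location\s*=\s*'(https?://[^']+)'")


def extract_redirect(html):
    """Return the URL assigned to window.location, or None if there is none."""
    match = REDIRECT_RE.search(html)
    return match.group(1) if match else None


# Static stand-in for the body returned by the login POST (token is a placeholder).
sample = """
<script type="text/javascript">
$(document).ready(function(){
setTimeout(function(){
window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=PLACEHOLDER';;
},1000);
});
</script>
"""

print(extract_redirect(sample))
print(extract_redirect("<p>no redirect here</p>"))
```

In the session code, the result could be checked for None before calling s.get(url), so that a failed login produces a clear message instead of an AttributeError.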
If we run the code and print the data-original-title attribute from the main content, you can see that we are on the dashboard page:
In [5]: with requests.Session() as s:
   ...:     PASSWORD = 'xxxxxx'
   ...:     USERNAME = "xxxxxxxxxx"
   ...:     URL = 'https://www.onlydomains.com/account/login'
   ...:     soup = BeautifulSoup(s.get(URL).text, "lxml")
   ...:     csrf = soup.select_one("input[name=csrfToken]")["value"]
   ...:     login_data = {
   ...:         'csrfToken' : csrf,
   ...:         'username' : USERNAME,
   ...:         'password' : PASSWORD}
   ...:     r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})
   ...:     url = patt.search(r.text).group(1)
   ...:     r = s.get(url).text
   ...:     soup = BeautifulSoup(r,"lxml")
   ...:     print(soup.select_one("h1.PageTitle.visible-xs i.fa.fa-info-circle")["data-original-title"])
   ...:
Welcome to your Dashboard! Here you have a general overview of what's happening and how to manage your domain assets.
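A cheaper sanity check than parsing the dashboard markup is to look at the page title, since the failed attempt in the question always came back titled "Login / Sign Up - OnlyDomains". This is a rough sketch, assuming that title is only used on the login page. The names page_title and is_login_page are hypothetical helpers, and the second sample document is made up for illustration.

```python
import re

# Grab the text of the first <title> element; DOTALL allows multi-line titles.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def page_title(html):
    """Return the stripped text of the first <title> element, or None."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def is_login_page(html):
    """True if the title starts with 'Login', as on the page seen in the question."""
    title = page_title(html)
    return title is not None and title.startswith("Login")


# First sample is the output quoted in the question; the second is made up.
login_html = ('<!DOCTYPE html><html lang="en"><head><meta charset="utf-8" />'
              '<title>Login / Sign Up - OnlyDomains</title>')
other_html = "<html><head><title>Some other page</title></head></html>"

print(is_login_page(login_html))
print(is_login_page(other_html))
```

Calling is_login_page(r) on the text fetched after the redirect would tell you whether the session is still unauthenticated.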
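I think the best way to solve this is to use Selenium (I remember doing something similar to what you want to do with BS, but I can't recall right now how I did it):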
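from selenium import webdriver

# Path to the ChromeDriver executable; adjust it for your system.
chromedriver = 'C:\\chromedriver.exe'
browser = webdriver.Chrome(chromedriver)
browser.get('http://www.example.com')

# Fill in the login fields by their name attributes.
# (find_element_by_* is the Selenium 2/3-era API; Selenium 4 uses find_element(By.NAME, ...).)
username = browser.find_element_by_name('username')
username.send_keys('user1')
password = browser.find_element_by_name('password')
password.send_keys('secret')

# Submit the form so the browser loads the next page.
form = browser.find_element_by_id('loginForm')
form.submit()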
This will let you load the next page, which should contain the information you want :)