Python 2.7, requests, login into onlydomains.com site

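For a few days now I have been trying to log in to the www.onlydomains.com site so that I can retrieve my list of domains from a script. So far I have something like this: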

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import requests, sys, re, whois
from bs4 import BeautifulSoup

def onlydomains():
    with requests.Session() as c:
        PASSWORD = 'my%password'
        USERNAME = 'my_username'
        URL = 'https://www.onlydomains.com/account/login'
        c.get(URL)
        soup = BeautifulSoup(c.get(URL).text, "lxml")

        csrf = soup.find("input", value=True)["value"]

    login_data = {
        'csrfToken' : csrf,
        'username' : USERNAME,
        'password' : PASSWORD,
        'submit' : 'Submit',}

    r = c.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})
    r = c.get('https://onlydomains.secure-admin.com/domain/index')
    print r.text

onlydomains()

It does not work for me, because I always get the login page back:

 > ./onlydomains.py

    <!DOCTYPE html><html lang="en"><head><meta charset="utf-8" /><title>Login / Sign Up - OnlyDomains</title>

Any ideas?

If you look at what the POST returns, you can see that the page sets window.location = some_url from an inline script:

    <script type="text/javascript">
                $(document).ready(function(){

                    setTimeout(function(){

                            window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=v42oadi4cAuxIM4PHc5IdgU%5CdXd3AjswsOraTLjynso%3D';;


                    },1000);
                });
            </script>

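requests does not execute JavaScript, so that redirect is never followed. You can extract the URL yourself and use it to reach the page: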

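import re
import requests
from bs4 import BeautifulSoup

# Captures the URL assigned to window.location in the inline script
patt = re.compile(r"window\.location\s+=\s+'(http.*)'")

with requests.Session() as s:
    USERNAME = "user"
    PASSWORD = "pass"
    URL = 'https://www.onlydomains.com/account/login'
    # Load the login page first to pick up cookies and the CSRF token
    soup = BeautifulSoup(s.get(URL).text, "lxml")
    csrf = soup.select_one("input[name=csrfToken]")["value"]

    login_data = {
        'csrfToken' : csrf,
        'username' : USERNAME,
        'password' : PASSWORD}

    # Post with the same session (s) so the login cookies are kept
    r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})

    # Follow the JavaScript redirect by hand
    url = patt.search(r.text).group(1)
    r = s.get(url).text
    print(r)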
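
The redirect-extraction step can be tried offline before pointing it at the live site. The following is only a sketch: `extract_redirect` and `REDIRECT_RE` are names made up for illustration, and the sample text is a trimmed copy of the script block shown above. It also returns None when no redirect is present, which is what you would most likely see if the login was rejected.

```python
import re

# Same idea as the pattern in the answer, but stopping at the closing quote
# instead of relying on greedy matching. (Hypothetical name.)
REDIRECT_RE = re.compile(r"window\.location\s*=\s*'(https?://[^']+)'")


def extract_redirect(html):
    """Return the URL assigned to window.location, or None if there is none."""
    match = REDIRECT_RE.search(html)
    return match.group(1) if match else None


# Trimmed copy of the script block returned by the POST in the answer.
sample = """
<script type="text/javascript">
    $(document).ready(function(){
        setTimeout(function(){
            window.location = 'https://onlydomains.secure-admin.com/dashboard/index?_srs_=v42oadi4cAuxIM4PHc5IdgU%5CdXd3AjswsOraTLjynso%3D';;
        },1000);
    });
</script>
"""

print(extract_redirect(sample))
print(extract_redirect("<html>no redirect here</html>"))
```

Checking for None before calling `s.get(url)` avoids the AttributeError that `patt.search(r.text).group(1)` would raise when the pattern does not match.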

If we run the code and print the data-original-title attribute from the main content, you can see that we are on the dashboard page:

In [5]: with requests.Session() as s:
   ...:         PASSWORD = 'xxxxxx'
   ...:         USERNAME = "xxxxxxxxxx"
   ...:         URL = 'https://www.onlydomains.com/account/login'
   ...:         soup = BeautifulSoup(s.get(URL).text, "lxml")
   ...:         csrf = soup.select_one("input[name=csrfToken]")["value"]
   ...:         login_data = {
   ...:         'csrfToken' : csrf,
   ...:         'username' : USERNAME,
   ...:         'password' : PASSWORD}
   ...:         r = s.post(URL, data=login_data, headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'})
   ...:         url = patt.search(r.text).group(1)
   ...:         r = s.get(url).text
   ...:         soup = BeautifulSoup(r,"lxml")
   ...:         print(soup.select_one("h1.PageTitle.visible-xs i.fa.fa-info-circle")["data-original-title"])
   ...:     

Welcome to your Dashboard! Here you have a general overview of what's happening and how to manage your domain assets.
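
A cheaper check than parsing the dashboard markup is to look at the page title, because the failed attempt in the question comes back titled "Login / Sign Up - OnlyDomains". The sketch below assumes that any title starting with "Login" means the session is not authenticated. The helper names are hypothetical, and the "Dashboard" title in the second call is an invented stand-in rather than the site's real title.

```python
import re

# Grabs the text between <title> and </title>. (Hypothetical name.)
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def page_title(html):
    """Return the stripped <title> text, or None if the page has no title."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def looks_like_login_page(html):
    """Assumption: a title starting with 'Login' means we are not logged in."""
    title = page_title(html)
    return title is not None and title.startswith("Login")


# First line of the response quoted in the question.
failed = ('<!DOCTYPE html><html lang="en"><head><meta charset="utf-8" />'
          '<title>Login / Sign Up - OnlyDomains</title>')

print(looks_like_login_page(failed))
print(looks_like_login_page("<html><head><title>Dashboard</title></head></html>"))
```

Calling `looks_like_login_page(r)` on the text fetched after the redirect gives a quick yes/no answer before you start scraping the domain list.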

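I think the best way to solve this is to use selenium. (I remember doing something similar to what you are attempting with BS, but I cannot recall how right now.)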

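from selenium import webdriver
from selenium.webdriver.common.by import By

# Path to the ChromeDriver binary; newer Selenium releases expect it to be
# passed through a Service object rather than positionally.
chromedriver = 'C:\\chromedriver.exe'
browser = webdriver.Chrome(chromedriver)
browser.get('http://www.example.com')

# Fill in the login form fields, located by their name attributes
username = browser.find_element(By.NAME, 'username')
username.send_keys('user1')

password = browser.find_element(By.NAME, 'password')
password.send_keys('secret')

# Submit the form; the browser runs the JavaScript redirect itself
form = browser.find_element(By.ID, 'loginForm')
form.submit()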

This will let you load the next page, which should contain the information you want :)

