
Parsing data from a website that requires login in Python

I've been trying to parse data from a website that requires a login, so I've been using the code below:

import requests
from lxml import html
session_requests = requests.session()
payload = {
    "login-username": "myusername", 
    "login-password": "mypassword"
}
login_url = "https://oprewards.com/login"
result = session_requests.get(login_url)

tree = html.fromstring(result.text)
result = session_requests.post(
    login_url, 
    data = payload, 
    headers = dict(referer=login_url)
)
url = 'https://oprewards.com/profile'
result = session_requests.get(
    url, 
    headers = dict(referer = url)
)

print(result.content)

But it isn't working. I'm not very good at Python, so I'd appreciate any help. Thanks.

Thanks for asking this question.

One thing right off the bat: you'll want to check where the login actually occurs. If you open the browser's network tab while logging in, you'll see that the form doesn't send its request to the page shown to the user, but to a different URL:

https://oprewards.com/ASEngine/ASAjax.php

Once you investigate which URLs your data is actually sent to, I think you can construct a more accurate request to log yourself in.
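As a rough sketch of what that could look like, the function below posts the credentials to the AJAX endpoint instead of the login page, then fetches the profile page with the same session. I'm assuming the field names from your payload ("login-username", "login-password") are what the endpoint expects; I haven't confirmed that, so compare them against the form data shown in the network tab. The function name `login_and_fetch_profile` is just something I made up for illustration.

```python
LOGIN_PAGE = "https://oprewards.com/login"
AJAX_URL = "https://oprewards.com/ASEngine/ASAjax.php"
PROFILE_URL = "https://oprewards.com/profile"


def login_and_fetch_profile(session, username, password):
    """Log in through the AJAX endpoint, then return the profile page HTML.

    `session` is expected to behave like a requests.Session.
    """
    # Visit the login page first so the session picks up any cookies it sets.
    session.get(LOGIN_PAGE)

    # ASSUMPTION: these field names are copied from the question's payload.
    # The real endpoint may expect different or additional fields.
    payload = {"login-username": username, "login-password": password}
    response = session.post(
        AJAX_URL,
        data=payload,
        headers={"Referer": LOGIN_PAGE},
    )
    response.raise_for_status()

    # Reuse the same session so the login cookies are sent along.
    profile = session.get(PROFILE_URL, headers={"Referer": LOGIN_PAGE})
    profile.raise_for_status()
    return profile.text


# Example usage (performs real network requests):
#   import requests
#   with requests.Session() as session:
#       print(login_and_fetch_profile(session, "myusername", "mypassword"))
```

If the profile page still comes back as the login page, the endpoint most likely wants extra fields (a token or an action name, for example), which you'd also find in the network tab.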


However, if you want to log in exactly as a normal user would (that is, by entering a username and password and clicking the "Login" button), I'd suggest using a browser-automation tool, such as Selenium WebDriver for Python: https://selenium-python.readthedocs.io/getting-started.html
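A minimal sketch of the browser-automation route might look like this. The element locators are guesses on my part: I'm assuming the inputs have `name` attributes matching your payload keys and that the login button is a `button[type='submit']`, so inspect the page and adjust them. The fixed sleep is also a crude stand-in for a proper wait, and `fetch_profile_via_browser` is a hypothetical helper name.

```python
import time

LOGIN_URL = "https://oprewards.com/login"
PROFILE_URL = "https://oprewards.com/profile"


def fetch_profile_via_browser(driver, username, password, settle_seconds=2.0):
    """Fill in the login form in a real browser, then return the profile HTML.

    `driver` is expected to be a Selenium WebDriver instance.
    """
    driver.get(LOGIN_URL)

    # ASSUMPTION: the locators below are not verified against the real page.
    # "name" and "css selector" are the string values behind Selenium's
    # By.NAME and By.CSS_SELECTOR constants.
    driver.find_element("name", "login-username").send_keys(username)
    driver.find_element("name", "login-password").send_keys(password)
    driver.find_element("css selector", "button[type='submit']").click()

    # Crude pause to give the AJAX login time to finish before moving on.
    time.sleep(settle_seconds)

    driver.get(PROFILE_URL)
    return driver.page_source


# Example usage (opens a real browser; requires Selenium and a browser driver):
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   try:
#       print(fetch_profile_via_browser(driver, "myusername", "mypassword"))
#   finally:
#       driver.quit()
```

This is slower than plain requests, but the browser runs the site's JavaScript for you, so you don't have to reverse-engineer the login request.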
