I have Python requests/BeautifulSoup code below that lets me log in to a URL successfully. However, after login, to get the data I need I would normally have to manually:
1) click on 'statement' in the first row:
2) Select dates, click 'run statement':
3) view data:
This is the code I have used to log in and reach step 1 above:
import requests
from bs4 import BeautifulSoup

logurl = "https://login.flash.co.za/apex/f?p=pwfone:login"
posturl = 'https://login.flash.co.za/apex/wwv_flow.accept'

with requests.Session() as s:
    s.headers = {"User-Agent": "Mozilla/5.0"}
    res = s.get(logurl)
    soup = BeautifulSoup(res.text, "html.parser")

    arg_names = []
    for name in soup.select("[name='p_arg_names']"):
        arg_names.append(name['value'])

    values = {
        'p_flow_id': soup.select_one("[name='p_flow_id']")['value'],
        'p_flow_step_id': soup.select_one("[name='p_flow_step_id']")['value'],
        'p_instance': soup.select_one("[name='p_instance']")['value'],
        'p_page_submission_id': soup.select_one("[name='p_page_submission_id']")['value'],
        'p_request': 'LOGIN',
        'p_t01': 'solar',
        'p_arg_names': arg_names,
        'p_t02': 'password',
        'p_md5_checksum': soup.select_one("[name='p_md5_checksum']")['value'],
        'p_page_checksum': soup.select_one("[name='p_page_checksum']")['value']
    }
    s.headers.update({'Referer': logurl})
    r = s.post(posturl, data=values)
    print(r.content)
My question is (beginner speaking): how could I skip steps 1 and 2 and simply do another headers update and POST to the final URL, using the selected dates as form entries (headers and form info below)? (The Referer header is the URL from step 2 above.)
Edit 1: network request from csv file download:
Use Selenium WebDriver; it has many good features for working with web services.
Selenium is going to be your best bet for automated browser interaction. It can be used not only to scrape data from websites but also to interact with forms and the like. I highly recommend it, as I have used it quite a bit in the past. If you already have pip and Python installed, run:
pip install selenium
That installs Selenium, but you also need to install either geckodriver (for Firefox) or chromedriver (for Chrome). Then you should be up and running!
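A minimal sketch of what the login flow might look like with Selenium. Note that the element IDs below (P101_USERNAME and so on) are hypothetical placeholders; you would need to inspect the actual login page to find the real ones, and a driver (here geckodriver) must be on your PATH:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Launch Firefox; requires geckodriver on PATH
driver = webdriver.Firefox()
driver.get("https://login.flash.co.za/apex/f?p=pwfone:login")

# Fill in and submit the login form.
# The element IDs here are assumptions -- inspect the page for the real ones.
driver.find_element(By.ID, "P101_USERNAME").send_keys("solar")
driver.find_element(By.ID, "P101_PASSWORD").send_keys("password")
driver.find_element(By.ID, "P101_LOGIN").click()

# After login, Selenium can click through the same steps you do by hand,
# e.g. the 'statement' link in the first row:
driver.find_element(By.LINK_TEXT, "statement").click()

driver.quit()
```

Because Selenium drives a real browser, it handles cookies, JavaScript, and redirects for you, at the cost of being slower than a raw requests session.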
As others have recommended, Selenium is a good tool for this sort of task. However, I'd like to suggest a way to use requests for this purpose, as that's what you asked for in the question.
The success of this approach would really depend on how the webpage is built and how data files are made available (if "Save as CSV" in the view data is what you're targeting).
If the login mechanism is cookie-based, you can use Sessions and Cookies in requests. When you submit the login form, the server returns a cookie in the response headers; sending that cookie back on every subsequent page request is what makes your login stick. A requests.Session does this bookkeeping for you automatically.
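To illustrate the mechanism offline (your login code already uses a Session, so this happens automatically after the POST): the session keeps a cookie jar, and anything in it is attached to later requests. The cookie name below is a made-up placeholder, not the real one the site uses:

```python
import requests

# A Session persists cookies across requests made through it.
s = requests.Session()

# Simulate what a successful login response would do via Set-Cookie.
# "ORA_WWV_APP_SESSION" is a hypothetical cookie name for illustration.
s.cookies.set("ORA_WWV_APP_SESSION", "abc123", domain="login.flash.co.za")

# The cookie now lives in the session's jar and will be sent with every
# subsequent request to that domain, keeping you logged in.
print(s.cookies.get("ORA_WWV_APP_SESSION"))
```

This is why the OP's `with requests.Session() as s:` block is the right shape: as long as the later POSTs go through the same `s`, the login cookie rides along.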
Also, inspect the network request for the "Save as CSV" action in the Developer Tools Network pane. If the request has a recognizable structure, you may be able to issue it directly within your authenticated session, passing a statement identifier and the dates as the payload to get your results.
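A sketch of what that direct request could look like. Every field name, date format, and value below is an assumption for illustration; the real names must be copied from the captured network request in Developer Tools:

```python
import requests

def build_statement_payload(start_date, end_date, statement_id):
    """Build the form payload for the 'run statement' / CSV request.

    All field names here are hypothetical placeholders -- replace them
    with the exact names seen in the browser's Network pane capture.
    """
    return {
        "p_request": "CSV",            # assumed request type
        "p_statement_id": statement_id,
        "p_start_date": start_date,
        "p_end_date": end_date,
    }

payload = build_statement_payload("01-JAN-2018", "31-JAN-2018", "12345")
print(payload)

# Then, inside the authenticated session `s` from the login code:
# csv_resp = s.post("https://login.flash.co.za/apex/wwv_flow.accept",
#                   data=payload)
# with open("statement.csv", "wb") as f:
#     f.write(csv_resp.content)
```

If the server also checks hidden fields or checksums (as the login page does), you would need to scrape those from the statement page first, the same way the login code pulls `p_md5_checksum` and friends.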