
Web Scraping using Requests - Python

I am trying to get data using the Requests library, but I'm doing something wrong. Here is what I do when I search manually:

URL - https://www9.sabesp.com.br/agenciavirtual/pages/template/siteexterno.iface?idFuncao=18

I fill in the “Informe o RGI” field and click the Prosseguir (Next) button, and the site shows the result for that RGI.

Before coding, I ran the manual search and checked the Form Data that the browser sent.

And then I tried it with this code:

import requests

data = { "frmhome:rgi1": "0963489410"}

url = "https://www9.sabesp.com.br/agenciavirtual/block/send-receive-updates"
res = requests.post(url, data=data)

print(res.text)

My output is:

<session-expired/>

What am I doing wrong?

Many thanks.

When you visit the site in a browser, the server creates a session and stores its identifier in a cookie on your machine. The browser then sends that cookie along with every later request, including the POST. You receive a session-expired error because your script sends the POST without any session cookie.

Try this code. It requests the entry page first and stores the cookies. The cookies are then sent with the POST request.

import requests

# A Session keeps the cookies from every response and resends them automatically.
session = requests.Session()

# Load the entry page first so the server issues a session cookie.
response = session.get('https://www9.sabesp.com.br/agenciavirtual/pages/home/paginainicial.iface', timeout=30)
print(session.cookies.get_dict())  # show the cookies the server set

data = {"frmhome:rgi1": "0963489410"}

url = "https://www9.sabesp.com.br/agenciavirtual/block/send-receive-updates"
# Post through the same session so the stored cookies go with the request.
res = session.post(url, data=data, timeout=30)

print(res.text)
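
If you want to see the effect without hitting the real site, here is a minimal sketch against a local stand-in server. Everything about the server is an assumption made up for the demo (the DemoHandler class, the DEMOSESSION cookie name and the /submit path); it only mimics the "no cookie, no session" behaviour and is not how the Sabesp site is actually implemented. The real site may also expect more form fields than the single RGI value.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in: issues a cookie on GET, checks it on POST."""

    def do_GET(self):
        # The entry page hands out a session cookie.
        self.send_response(200)
        self.send_header("Set-Cookie", "DEMOSESSION=abc123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_POST(self):
        # Read the form body, then answer based on the Cookie header.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        has_session = "DEMOSESSION=abc123" in self.headers.get("Cookie", "")
        body = b"<ok/>" if has_session else b"<session-expired/>"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
data = {"frmhome:rgi1": "0963489410"}

# A bare POST carries no cookie, so the server treats the session as expired.
without_session = requests.post(base + "/submit", data=data, timeout=5).text

# GET first through a Session, then POST through the same Session.
with requests.Session() as session:
    session.get(base + "/", timeout=5)
    with_session = session.post(base + "/submit", data=data, timeout=5).text

server.shutdown()
print(without_session)  # <session-expired/>
print(with_session)     # <ok/>
```

The only difference between the two POSTs is the cookie that the Session picked up from the earlier GET, which is the same thing the browser does for you on the real site.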

Technical posts on this site are licensed under CC BY-SA 4.0. If you reprint one, please cite the site URL or the original address.

 