I execute the following code to log in to the URL assigned to loginUrl. After authentication, I want to go to another webpage whose URL is stored in portfolioUrl. However, when I print(portfolioPage.content), it prints the page shown directly after login, not the portfolioPage that I want. What's wrong with my code?
from bs4 import BeautifulSoup
import requests
# create session
session = requests.Session()
loginUrl='https://www.investopedia.com/auth/realms/investopedia/protocol/openid-connect/auth?client_id=inv-simulator&redirect_uri=https%3A%2F%2Fwww.investopedia.com%2Fauth%2Frealms%2Finvestopedia%2Fshopify-auth%2Finv-simulator%2Flogin%3F%26redirectUrl%3Dhttps%253A%252F%252Fwww.investopedia.com%252Fauth%252Frealms%252Finvestopedia%252Fprotocol%252Fopenid-connect%252Fauth%253Fresponse_type%253Dcode%2526approval_prompt%253Dauto%2526redirect_uri%253Dhttps%25253A%25252F%25252Fwww.investopedia.com%25252Fsimulator%25252Fhome.aspx%2526client_id%253Dinv-simulator-conf&state=7edda3b2-eb6a-441f-8589-b42b8b78accf&response_mode=fragment&response_type=code&scope=openid&nonce=cd558670-7ae3-4c14-8281-bc149d4987b3'
portfolioUrl = 'https://www.investopedia.com/simulator/trade/tradestock.aspx'
payload = {
    'username': 'my email',
    'password': 'my password'
}
authPage = session.get(loginUrl)
soup = BeautifulSoup(authPage.content, 'html.parser')
form = soup.find('form')
postUrl = form['action']
auth = session.post(postUrl, data=payload)
portfolioPage = session.get(portfolioUrl)
soup = BeautifulSoup(portfolioPage.content, 'html.parser')
print(portfolioPage.content)
I don't think you are posting your data correctly, and you are not keeping your session open after you log in. Try this:
# Using requests.Session() as a context manager closes the session automatically when the block ends
with requests.Session() as login_request:
    payload = {
        'username': 'my email',
        'password': 'my password'
    }
    login_request.post(loginUrl, data=payload)
    # While still logged in, fetch the page stored in portfolioUrl
    source_code = login_request.get(portfolioUrl).content

# After this you can use BeautifulSoup to parse source_code
soup = BeautifulSoup(source_code, 'html.parser')
print(soup)  # check whether it prints the logged-in data
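If the POST still does not log you in, one thing worth checking is what the login form actually expects. Login forms often carry hidden input fields, and their action attribute may be a relative URL. Here is a minimal sketch of building the POST target and payload from the form itself. It uses only the standard library so it runs offline; LoginFormParser, build_login_post, the sample HTML, and the example.com URL are all made up for illustration and are not taken from the Investopedia page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LoginFormParser(HTMLParser):
    """Collect the action URL and named <input> fields of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None      # stays None if the page has no form
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form' and not self._done:
            self._in_form = True
            self.action = attrs.get('action') or ''
        elif tag == 'input' and self._in_form and attrs.get('name'):
            # Inputs without a value attribute are sent as empty strings
            self.fields[attrs['name']] = attrs.get('value') or ''

    def handle_endtag(self, tag):
        if tag == 'form' and self._in_form:
            self._in_form = False
            self._done = True


def build_login_post(page_url, html, credentials):
    """Return (absolute_post_url, form_data) for the first form in html."""
    parser = LoginFormParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError('no <form> found on the page')
    data = dict(parser.fields)   # hidden fields first
    data.update(credentials)     # then overwrite with the real credentials
    # urljoin resolves a relative action against the page it came from
    return urljoin(page_url, parser.action), data


# Hypothetical login page, only to show the shape of the result
sample_html = '''
<form action="/auth/login-actions/authenticate?session_code=abc" method="post">
  <input type="hidden" name="credentialId" value="">
  <input type="text" name="username">
  <input type="password" name="password">
</form>
'''
post_url, data = build_login_post(
    'https://example.com/auth/realms/demo/protocol/openid-connect/auth',
    sample_html,
    {'username': 'my email', 'password': 'my password'},
)
print(post_url)
print(data)
```

With a real session you would pass the URL and text of the login page response, then call session.post(post_url, data=data). Whether the site needs the hidden fields is an assumption you would have to verify against the actual form.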
You can try this:
import requests
from bs4 import BeautifulSoup
# create session
session = requests.Session()
url = 'https://investopedia.com/simulator/portfolio/'
payload = {
    'username': 'your_email',
    'password': 'your_password'
}
# get log in page
auth_page = session.get(url)
soup = BeautifulSoup(auth_page.content, 'html.parser')
# get form
form = soup.find('form')
# get post url
post_url = form['action']
# auth
session.post(post_url, data=payload)
# parse content
content_url = 'https://investopedia.com/simulator/trade/tradestock.aspx'
page = session.get(content_url)
page_soup = BeautifulSoup(page.content, 'html.parser')
# locate the simulator page container, then its second table
sim_page = page_soup.find('div', {'class': 'sim-page'})
table = sim_page.find_all('table', {'class': 'table2'})[1]
rows = table.find_all('tr')
for row in rows:
    print(row.find('th').text)
    print(row.find('td').text)
    print('----')
Output:
Value (USD)
$10,000.00
----
Buying Power
$10,000.00
----
Cash
$10,000.00
----
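If you get a different page than the one you asked for, it can help to compare the URL you requested with the URL you ended up on, since requests follows redirects by default and a failed login typically bounces you somewhere else. A small sketch of that check follows; landed_on is a hypothetical helper, and ignoring a leading "www." and a trailing slash is my own assumption about what should count as the same page.

```python
from urllib.parse import urlsplit


def landed_on(requested_url, final_url):
    """Return True if final_url has the same host and path as requested_url.

    A leading 'www.' and a trailing '/' are ignored; the query string and
    fragment are not compared.
    """
    def host(parts):
        name = (parts.hostname or '').lower()
        return name[4:] if name.startswith('www.') else name

    wanted, got = urlsplit(requested_url), urlsplit(final_url)
    return (host(wanted) == host(got)
            and wanted.path.rstrip('/') == got.path.rstrip('/'))


# With a live session (not run here), response.url is the final URL after
# redirects and response.history lists the intermediate responses:
#   page = session.get(content_url)
#   if not landed_on(content_url, page.url):
#       print('redirected to', page.url,
#             'via', [r.status_code for r in page.history])
```

If the check fails, the final URL usually tells you whether you were sent back to the login page.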