BeautifulSoup requests.post not scraping correctly

Question

I am attempting to scrape a site that requires a login:

login_url = 'https://www.spotrac.com/signin/'
data = {
        'email': 'a****@gmail.com',
        'password': '******'
}

with requests.Session() as s:
    response = requests.post(login_url , data)
    index_page= s.get('https://www.spotrac.com/nba/contracts/breakdown/2010/')
    soup = BeautifulSoup(index_page.text, 'html.parser')

This code scrapes the page, but only as if I hadn't logged in, i.e. none of the data I would expect to see after a successful login is returned.

Where am I going wrong here?

Answer 1

I think you are sending your username and password to the wrong URL. https://www.spotrac.com/signin is the page that displays the login form, but https://www.spotrac.com/signin/submit/ is the URL the form sends your credentials to when you click submit.

I was not able to test this code because I don't want to pay $30:

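import requests
from bs4 import BeautifulSoup

login_url = 'https://www.spotrac.com/signin/submit/'
data = {
    'email': 'a****@gmail.com',
    'password': '******'
}

with requests.Session() as s:
    # Post through the session (s.post, not requests.post) so the login
    # cookies are kept and sent with the next request.
    response = s.post(login_url, data=data)
    index_page = s.get('https://www.spotrac.com/nba/contracts/breakdown/2010/')
    soup = BeautifulSoup(index_page.text, 'html.parser')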
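
One more thing to check, whichever URL turns out to be right: the login POST and the page GET have to go through the same Session object, otherwise any cookie set at login is not sent with the second request. Here is a minimal sketch of that pattern. The helper name fetch_after_login is my own invention, not part of requests, and I am assuming the site uses ordinary cookie-based login with the form fields shown above; I could not verify this against the real site.

```python
def fetch_after_login(session, login_url, credentials, target_url):
    """Log in and fetch target_url through the SAME session.

    `session` is expected to behave like requests.Session: cookies set by
    the login response are stored on it and sent with later requests.
    (Hypothetical helper for illustration, untested against the real site.)
    """
    # Post via the session, not the module-level requests.post, so that
    # any login cookie ends up in session.cookies.
    login_response = session.post(login_url, data=credentials)
    # Raises on 4xx/5xx; a 200 alone does not prove the login succeeded.
    login_response.raise_for_status()

    # The same session now sends the stored cookies with this request.
    page = session.get(target_url)
    page.raise_for_status()
    return page.text


# Example usage (needs the third-party requests and beautifulsoup4 packages):
#   import requests
#   from bs4 import BeautifulSoup
#   with requests.Session() as s:
#       html = fetch_after_login(
#           s, login_url, data,
#           'https://www.spotrac.com/nba/contracts/breakdown/2010/')
#       soup = BeautifulSoup(html, 'html.parser')
```

If the page still looks logged out, inspecting s.cookies and the text of the login response after the POST should show whether the site accepted the credentials.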
