
Scrape site with login details with bs4 python

I am trying to log in to a website using bs4. The login completes, but when I try to parse the shop data, the page shows that I am not logged in.

import requests
from bs4 import BeautifulSoup

with requests.session() as c: 
    
    link="https://www.tais-shoes.ru/wp-login.php" 
    initial=c.get(link) 

    login_data = {"log": "*****","pwd": "*****", 
              "rememberme": "forever", 
              "redirect_to": "https://www.tais-shoes.ru/my-account/", 
              "redirect_to_automatic": "1"
             }

    page_login = c.post('https://www.tais-shoes.ru/wp-login.php', data=login_data)
    
    print(page_login) 
    
    shop_url = "https://www.tais-shoes.ru/shop/"
    html = requests.get(shop_url)
    soup = BeautifulSoup(html.text, 'html.parser')

    print(soup)

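You should keep using the requests.Session instance you created. Further down in your code you call requests.get instead, which opens a new connection that has none of the login cookies.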
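The difference can be seen without touching the network. The sketch below is only an illustration: the URL, cookie name, and cookie value are placeholders, not taken from the site in the question. It builds the same GET request once through a session that already holds a cookie and once without it, then compares the Cookie header.

```python
import requests

SHOP_URL = "https://example.com/shop/"  # placeholder URL, never actually contacted

session = requests.Session()
# Stand-in for the cookie a successful login POST would leave in the session.
session.cookies.set("session_id", "abc123", domain="example.com", path="/")

# Built through the session: the stored cookie is attached to the request.
with_session = session.prepare_request(requests.Request("GET", SHOP_URL))

# Built without the session: like a bare requests.get(), it knows nothing
# about that cookie.
without_session = requests.Request("GET", SHOP_URL).prepare()

print(with_session.headers.get("Cookie"))
print(without_session.headers.get("Cookie"))
```

The first request carries the cookie and the second has no Cookie header at all, which is why the shop page treats the second kind of request as a logged-out visitor.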

In your code, change this:

    html = requests.get(shop_url)

to this:

    html = c.get(shop_url)

Full code:

import requests
from bs4 import BeautifulSoup

# A single Session keeps the cookies from the login for every later request.
with requests.Session() as c:

    # Load the login page first so the session picks up any cookies the site sets.
    link = "https://www.tais-shoes.ru/wp-login.php"
    initial = c.get(link)

    # Form fields sent to the login form.
    login_data = {
        "log": "*****",
        "pwd": "*****",
        "rememberme": "forever",
        "redirect_to": "https://www.tais-shoes.ru/my-account/",
        "redirect_to_automatic": "1",
    }

    page_login = c.post(link, data=login_data)

    print(page_login)

    # Fetch the shop page through the same session, not through requests.get.
    shop_url = "https://www.tais-shoes.ru/shop/"
    html = c.get(shop_url)
    soup = BeautifulSoup(html.text, 'html.parser')

    print(soup)
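Note that print(page_login) only shows the HTTP status, which by itself does not confirm that the login worked. One way to check is to look at the cookies held by the session. This is a minimal sketch, assuming the site uses the stock WordPress cookie naming, where the login cookie name starts with wordpress_logged_in. The helper name is_logged_in and the demo cookie value are made up for illustration.

```python
import requests


def is_logged_in(session: requests.Session) -> bool:
    """Return True if the session holds a WordPress login cookie.

    Assumes the default WordPress cookie prefix; a customised site may differ.
    """
    return any(
        cookie.name.startswith("wordpress_logged_in")
        for cookie in session.cookies
    )


# Offline demonstration with a fake cookie instead of a real login.
demo = requests.Session()
before = is_logged_in(demo)
demo.cookies.set("wordpress_logged_in_0123abcd", "placeholder")
after = is_logged_in(demo)

print(before, after)
```

In the full code above, calling is_logged_in(c) right after the c.post(...) line would show whether the session actually received a login cookie before the shop page is requested.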

