BeautifulSoup doesn't fully parse the page

    import requests
    from bs4 import BeautifulSoup as bs

    url1 = 'https://school.karelia.ru/auth/login'
    url2 = 'https://school.karelia.ru/personal-area/#diary'

    payload = {
        'login_login': 'КлочковМ',
        'login_password': 'КлочковМ7'
    }

    def getHW():
        with requests.session() as s:
            s.post(url1, data=payload)
            r = s.get(url2)
            soup = bs(r.content, 'html.parser')
            print(soup.find_all("div"))

    getHW()

I am trying to parse a site, but this code doesn't parse it fully. In the website's source there are many more nested elements than in the result I get from this code:

<div class="right" id="main-region"></div>

For some reason, the div with class "right" just ends there, even though on the site it continues with much more content. Why could this be?

It is because you called soup.find_all("div"). The div ends there with </div>, and you told BS to look only for divs, so BS stops there. To actually search for classes, see for example this answer.
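As a minimal sketch of searching by class, the snippet below uses the `class_` argument of `find_all`. Both HTML strings and the helper name `find_by_class` are made up for illustration and are not taken from the real site. Note that `find_all` returns each matching tag together with whatever children it has in the parsed HTML, so an empty `main-region` div in the output suggests the HTML that `requests` received was already empty at that point. One possible cause, which is an assumption here and not something the thread confirms, is that the page fills that element in with JavaScript after loading, which `requests` does not run.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: what a server might return, and what a browser might
# show after scripts have run. Neither string comes from the real site.
served_html = (
    '<div class="left">menu</div>'
    '<div class="right" id="main-region"></div>'
)
rendered_html = (
    '<div class="right" id="main-region">'
    '<div class="diary"><span>Homework</span></div>'
    '</div>'
)


def find_by_class(html, css_class):
    """Return every <div> whose class attribute includes css_class."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", class_=css_class)


served = find_by_class(served_html, "right")
rendered = find_by_class(rendered_html, "right")

# The served div has no children, so there is nothing nested to find.
print(served[0])

# When the children are present in the HTML, they come back with the tag.
print(rendered[0].find("span").get_text())
```

Printing `r.text` from the original script and searching it for a string that is visible in the browser is a quick way to check whether the missing content was ever in the response.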
