BeautifulSoup doesn't fully parse the page
import requests
from bs4 import BeautifulSoup as bs

url1 = 'https://school.karelia.ru/auth/login'
url2 = 'https://school.karelia.ru/personal-area/#diary'
payload = {
    'login_login': 'КлочковМ',
    'login_password': 'КлочковМ7'
}

def getHW():
    # Use one session so the login cookies are sent with the second request
    with requests.Session() as s:
        s.post(url1, data=payload)
        r = s.get(url2)
        soup = bs(r.content, 'html.parser')
        print(soup.find_all("div"))

getHW()
I am trying to parse a site, and this code just doesn't do it fully. In the website's code, there are a lot more nested elements than in the result I get from this code:
<div class="right" id="main-region"></div>
For some reason, the div with class "right" just ends there, even though on the site it contains a lot more. Why could this be?
It is because you called soup.find_all("div"). The div ends there with </div>, and you told BS to look only for divs, so BS stops there. To search by class instead, see for example this answer.
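A minimal sketch of searching by class or by id, in case it helps. The `html` string is a made-up stand-in for whatever `s.get(url2)` returns (only the `main-region` div is taken from the question), so treat it as an illustration rather than the site's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML returned by s.get(url2);
# only the "main-region" div comes from the question.
html = """
<div class="left" id="menu"><p>menu</p></div>
<div class="right" id="main-region"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Search by class; the keyword is class_ because class is reserved in Python.
right_divs = soup.find_all("div", class_="right")

# Or look the element up directly by its id.
main_region = soup.find(id="main-region")

# .contents lists the element's children; an empty list means the
# fetched HTML really has nothing inside this div.
is_empty = main_region is not None and not main_region.contents

print(right_divs)
print("main-region is empty:", is_empty)
```

If the div turns out to be empty in the HTML that requests received, one possible explanation (an assumption about this site, not something confirmed here) is that the browser fills it in later with JavaScript, which requests does not run.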