
Python 3.6: Beautiful Soup - How to extract all the text in a div container?

[<div class="nav-wrapper">
<p class="navigation-links">
<span class="page-numbers current">1</span>
<a class="page-numbers" href="http://www.example.com/2/">2</a>
<a class="page-numbers" href="http://www.example.com/3/">3</a>
<span class="page-numbers dots">…</span>
<a class="page-numbers" href="http://www.example.com/6/">6</a>
<a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
</div>]

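Also, is there a simple way to extract the maximum page number from the page navigation bar, assuming the entry after the "span class" dots element is the upper limit?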

from bs4 import BeautifulSoup

html = '''<div class="nav-wrapper">
          <p class="navigation-links">
          <span class="page-numbers current">1</span>
          <a class="page-numbers" href="http://www.example.com/2/">2</a>
          <a class="page-numbers" href="http://www.example.com/3/">3</a>
          <span class="page-numbers dots">…</span>
          <a class="page-numbers" href="http://www.example.com/6/">6</a>
          <a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
          </div>'''
bs = BeautifulSoup(html, "html.parser")
# Locate the "dots" span, then take the text of the next tag after it,
# which is the link to the last page.
max_page = bs.find('span', {'class': 'page-numbers dots'}).find_next().text
print(max_page)  # 6
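The title question (getting all the text inside the div) is not covered by the snippet above, so here is a minimal sketch of one way to do it. It assumes the same sample markup and uses `stripped_strings`, which yields each text node with surrounding whitespace removed. The variable names `soup`, `div`, and `texts` are only illustrative. It also shows an alternative for the page count that does not depend on the dots span being present: take the largest label that is purely numeric.

```python
from bs4 import BeautifulSoup

html = '''<div class="nav-wrapper">
<p class="navigation-links">
<span class="page-numbers current">1</span>
<a class="page-numbers" href="http://www.example.com/2/">2</a>
<a class="page-numbers" href="http://www.example.com/3/">3</a>
<span class="page-numbers dots">…</span>
<a class="page-numbers" href="http://www.example.com/6/">6</a>
<a class="next page-numbers" href="http://www.example.com/2/">Next →</a> </p>
</div>'''

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="nav-wrapper")

# Collect every non-empty text node inside the div, whitespace stripped.
texts = list(div.stripped_strings)
print(texts)  # ['1', '2', '3', '…', '6', 'Next →']

# Alternative page count: the largest label made up only of digits.
# This is an assumption about the markup, not part of the original answer.
max_page = max(int(t) for t in texts if t.isdigit())
print(max_page)  # 6
```

If a single string is preferred over a list, `div.get_text(" ", strip=True)` joins the same pieces with spaces.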

