Beautiful Soup: find children of a particular div

I am trying to parse a webpage that looks like this with Python and Beautiful Soup:

[image]

I am trying to extract the contents of the highlighted td. Currently I can get all the tds with:

alltd = soup.find_all('td')

for td in alltd:
    print(td)

But I am trying to narrow the scope so that it searches only the tds inside the div with the class "tablebox". That will probably still return 30+ results, but that is a more manageable number than 300+.

How can I extract the contents of the highlighted td in the picture above?

It is useful to know that whatever elements BeautifulSoup finds within an element still have the same type as that parent element. That is, the same search methods (such as find_all) can be called on them.

So this is roughly working code for your example:

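from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
# Find every div with the class "tablebox"
div_tags = soup.find_all("div", {"class": "tablebox"})

for div in div_tags:
    # Search only inside this div for td cells with the class "align-right"
    td_tags = div.find_all("td", {"class": "align-right"})
    for td in td_tags:
        print(td.text)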

This will print the text of every td tag with the class "align-right" that is nested inside a div with the class "tablebox".
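For a version that can be run without the original page, here is a minimal sketch. The HTML string is made-up markup standing in for the page in the question (the real page's structure is only known from the screenshot), and the CSS-selector form with select() is an alternative I am adding, not part of the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
html = """
<table><tr><td class="align-right">outside</td></tr></table>
<div class="tablebox">
  <table>
    <tr><td>label</td><td class="align-right">42</td></tr>
    <tr><td>other</td><td class="align-right">17</td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Nested search: find the divs first, then search inside each one.
nested = [
    td.get_text(strip=True)
    for div in soup.find_all("div", class_="tablebox")
    for td in div.find_all("td", class_="align-right")
]

# CSS selector: td.align-right that is a descendant of div.tablebox.
selected = [
    td.get_text(strip=True)
    for td in soup.select("div.tablebox td.align-right")
]

print(nested)
print(selected)
```

Both lists contain only the cells inside the "tablebox" div; the "outside" cell is skipped. One difference worth keeping in mind: if "tablebox" divs were nested inside each other, the loop version would visit the inner cells more than once, while select() returns each matching cell once.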
