
How to get parent items with BeautifulSoup4?

I'm getting soup using this:

soup = BeautifulSoup(html, 'lxml').find("tbody").find_all("tr")

The soup object then contains multiple similar <tr> elements, like this one:

<tr>
<td class="table">115</td>
<td>204</td>
<td><div><span class="flag-icon"></span>  United States <span> NY </span></div></td>
<td>brown</td>
<td>up</td>
<td class="table">groove</td>
</tr>

My goal is to get the stripped text from just the 1st, 2nd, and 4th <td> cells of each row and put them together into a small list, like this:

[115, 204, 'brown']

Once I have the small lists from all the <tr> elements, I need to collect them into one big list, like this:

[[115, 204, 'brown'], [32, 12, 'red'] ... [42, 87, 'yellow']]

To be honest, I did it with two for loops, slicing out the items I needed for each small list and appending it to the big list. But I assume there is a much better and simpler way to do that.

Do you have any ideas on how to use BeautifulSoup's capabilities in my case?

Try the following:

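from bs4 import BeautifulSoup

rows = BeautifulSoup(html, 'lxml').find("tbody").find_all("tr")
bigList = []
for row in rows:
    tds = row.find_all("td")
    # Keep the 1st, 2nd and 4th cells; get_text(strip=True) trims whitespace.
    bigList.append([tds[0].get_text(strip=True), tds[1].get_text(strip=True), tds[3].get_text(strip=True)])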
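
Note that the loop above gives you every cell as a string ('115', not 115), while the expected output in the question has integers. Here is a minimal sketch of one way to handle that, not the only one. The sample html, the helper names to_value and extract_rows, and the wanted parameter are all made up for illustration; only the first row and the values 32, 12, 'red' come from the question. It uses the built-in 'html.parser' so it runs without lxml, and passing parser='lxml' as in the question should behave the same on this markup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the row shown in the question.
html = """
<table><tbody>
<tr>
<td class="table">115</td>
<td>204</td>
<td><div><span class="flag-icon"></span>  United States <span> NY </span></div></td>
<td>brown</td>
<td>up</td>
<td class="table">groove</td>
</tr>
<tr>
<td class="table">32</td>
<td>12</td>
<td><div>Canada</div></td>
<td>red</td>
<td>down</td>
<td class="table">jazz</td>
</tr>
</tbody></table>
"""


def to_value(text):
    """Return an int for purely numeric text, otherwise the text unchanged."""
    return int(text) if text.isdigit() else text


def extract_rows(html, wanted=(0, 1, 3), parser="html.parser"):
    """Collect the cells at the `wanted` indexes from every <tr> in <tbody>."""
    rows = BeautifulSoup(html, parser).find("tbody").find_all("tr")
    result = []
    for row in rows:
        # recursive=False keeps only the row's direct <td> children.
        cells = row.find_all("td", recursive=False)
        result.append([to_value(cells[i].get_text(strip=True)) for i in wanted])
    return result


big_list = extract_rows(html)
print(big_list)  # [[115, 204, 'brown'], [32, 12, 'red']]
```

The indexes in wanted are zero-based, so (0, 1, 3) picks the 1st, 2nd and 4th cells. A row with fewer than four cells would raise an IndexError here, so add a length check if your table has irregular rows.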
