How to get parent items with BeautifulSoup4?

Question

I'm getting soup using this:

soup = BeautifulSoup(html, 'lxml').find("tbody").find_all("tr")

The soup object then contains multiple similar <tr> objects like this:

<tr>
<td class="table">115</td>
<td>204</td>
<td><div><span class="flag-icon"></span>  United States <span> NY </span></div></td>
<td>brown</td>
<td>up</td>
<td class="table">groove</td>
</tr>

My goal is to get the stripped text from just the 1st, 2nd and 4th <td> cells and put them together in a small list. Like this:

[115, 204, 'brown']

Once I have the small lists from all the <tr> rows, I have to add them all to one big list. Like this:

[[115, 204, 'brown'], [32, 12, 'red'] ... [42, 87, 'yellow']]

To be honest, I did it using two for loops and sliced out the needed items of each small list before appending to the big list. But I assume there is a much better and simpler way to do that.

Do you have any ideas on how to use BeautifulSoup's features for this case?

Answer 1

Try the following:

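from bs4 import BeautifulSoup

# html is the page source from the question
rows = BeautifulSoup(html, 'lxml').find("tbody").find_all("tr")
bigList = []
for row in rows:
    tds = row.find_all("td")
    # keep the 1st, 2nd and 4th cells (indices 0, 1, 3), stripped of whitespace
    bigList.append([tds[0].text.strip(), tds[1].text.strip(), tds[3].text.strip()])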
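
Note that the loop above yields strings for every cell, while the desired output in the question shows integers for the numeric columns. A minimal sketch of one way to handle that follows, assuming the wanted cells are always at positions 0, 1 and 3 and that a cell counts as numeric when its text is all digits. The helper names to_value and extract_rows, the second sample row, and the use of the built-in "html.parser" (instead of 'lxml', to keep the example free of extra dependencies) are assumptions for illustration, not part of the original question or answer.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the row in the question.
# The second row is made up for illustration.
html = """
<table><tbody>
<tr>
<td class="table">115</td>
<td>204</td>
<td><div><span class="flag-icon"></span>  United States <span> NY </span></div></td>
<td>brown</td>
<td>up</td>
<td class="table">groove</td>
</tr>
<tr>
<td class="table">32</td>
<td>12</td>
<td><div><span class="flag-icon"></span>  Canada <span> ON </span></div></td>
<td>red</td>
<td>down</td>
<td class="table">jazz</td>
</tr>
</tbody></table>
"""


def to_value(text):
    """Return an int when the text is all digits, otherwise the text itself."""
    return int(text) if text.isdigit() else text


def extract_rows(html, wanted=(0, 1, 3)):
    """Collect the wanted <td> cells of every <tr> inside <tbody>."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for row in soup.find("tbody").find_all("tr"):
        cells = row.find_all("td")
        # get_text(strip=True) removes surrounding whitespace from the cell text
        result.append([to_value(cells[i].get_text(strip=True)) for i in wanted])
    return result


big_list = extract_rows(html)
print(big_list)  # [[115, 204, 'brown'], [32, 12, 'red']]
```

The list comprehension replaces the inner loop and slicing, so only one explicit loop over the rows remains. If a row can have fewer than four cells, an index check would be needed before picking cells[3].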

solution1
1 ACCEPTED 2018-05-31 12:13:47
