
How do I get an item, but only if it is a sibling of a certain tag?

I have a long HTML file, but here's a fragment:

<tr>
    <td data-bind="text:name, css: isActive() ? 'variable-active': 'variable-inactive'" class="variable-active">Vehicle</td>
    <td data-bind="text:value">Ford</td>
</tr>

<tr>
    <td data-bind="text:name, css: isActive() ? 'variable-active': 'variable-inactive'" class="variable-inactive">Model</td>
    <td data-bind="text:value">Focus</td>
</tr>

I want to find every 'td' whose class is "variable-active" and then get the value from the next 'td' tag. In this case, since the cell in the second row is "variable-inactive", the output should be only:

"Vehicle - Ford"

I managed to get the first cell by matching on "variable-active", but I can't get the value from the 'td' that follows it. This is my code:

from bs4 import BeautifulSoup

with open("html.html", "r") as f:
    doc = BeautifulSoup(f, "html.parser")

tag = doc.findAll("tr")[0]

print(tag.findAll(class_="variable-active")[0].contents[0])  # Vehicle

tag.findNextSibling(class_="variable-active")  # nothing

You want to structure your search a little differently: call findNextSibling on the "variable-active" cell itself rather than on the row:

tag = soup.findAll("tr")[0]                # soup is the parsed document (named doc in the question)

tag1 = tag.find(class_="variable-active")  # <-- use .find
tag2 = tag1.findNextSibling()              # <-- use tag1.findNextSibling() to find next sibling tag

print(tag1.text)                           # <-- use .text to get all text from tag
print(tag2.text)

Prints:

Vehicle
Ford
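To see why the call in the question returned nothing, here is a minimal sketch that runs both lookups side by side. It assumes a trimmed copy of the fragment wrapped in a <table> (the data-bind attributes are left out), and the names from_row and from_cell are only illustrative:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the fragment from the question (data-bind attributes omitted).
html = """<table>
<tr><td class="variable-active">Vehicle</td><td>Ford</td></tr>
<tr><td class="variable-inactive">Model</td><td>Focus</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# Called on the <tr>: only the rows that follow are examined,
# and none of them carries the class, so the result is None.
from_row = row.find_next_sibling(class_="variable-active")

# Called on the active <td>: the next sibling tag is the value cell.
cell = row.find(class_="variable-active")
from_cell = cell.find_next_sibling()

print(from_row)
print(from_cell.text)
```

The class filter is also dropped in the second call, because the value cell does not have the "variable-active" class itself.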

Another version using CSS selectors:

data = soup.select(".variable-active, .variable-active + *")
print(" - ".join(d.text for d in data))

Prints:

Vehicle - Ford
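If the full document has several active rows, joining every match into one string would run them all together. A possible variation, sketched here under the assumption that each value cell is the 'td' directly after its name cell, builds one "name - value" string per active cell. The helper name active_pairs is hypothetical and not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Sample markup based on the question (data-bind attributes omitted for brevity).
html = """
<table>
<tr>
    <td class="variable-active">Vehicle</td>
    <td>Ford</td>
</tr>
<tr>
    <td class="variable-inactive">Model</td>
    <td>Focus</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def active_pairs(soup):
    """Return a "name - value" string for every active cell (illustrative helper)."""
    pairs = []
    for name_cell in soup.select("td.variable-active"):
        # find_next_sibling("td") skips whitespace text nodes between the cells.
        value_cell = name_cell.find_next_sibling("td")
        if value_cell is not None:  # ignore an active cell with no value cell after it
            name = name_cell.get_text(strip=True)
            value = value_cell.get_text(strip=True)
            pairs.append(f"{name} - {value}")
    return pairs


result = active_pairs(soup)
for line in result:
    print(line)
```

With the fragment above this prints a single line, Vehicle - Ford, because the "Model" row is inactive and is never selected.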

