How to identify whether a <p> tag is inside a <table> tag using Python BeautifulSoup?

How to find which p tags are inside vs outside a table tag?

<p> word outside p tag inside table tag </p>

<table>
  <tbody>
    <tr>
      <td>
        <p> word inside p tag inside table tag </p>
     </td>
   </tr>
 </tbody>
</table>

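You can use a CSS selector. The descendant selector "table p" matches every <p> that has a <table> somewhere among its ancestors: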

from bs4 import BeautifulSoup

html_doc = """
<p> word outside p tag inside table tag </p>

<table>
  <tbody>
    <tr>
      <td>
        <p> word inside p tag inside table tag </p>
     </td>
   </tr>
 </tbody>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

for p in soup.select("table p"):
    print(p.text)

Prints:

 word inside p tag inside table tag 

Or, using the bs4 API (reusing the soup object from above):

for table in soup.find_all("table"):
    for p in table.find_all("p"):
        print(p.text)
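
Both snippets above list only the <p> tags inside a table. To sort every <p> into "inside" or "outside", one option is to check each tag for a <table> ancestor with find_parent, which returns None when there is no such ancestor. A minimal sketch, assuming the same sample HTML; the list names inside and outside are just illustrative:

```python
from bs4 import BeautifulSoup

html_doc = """
<p> word outside p tag inside table tag </p>

<table>
  <tbody>
    <tr>
      <td>
        <p> word inside p tag inside table tag </p>
      </td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

inside, outside = [], []
for p in soup.find_all("p"):
    # find_parent("table") walks up the tree and returns the nearest
    # <table> ancestor, or None if the <p> is not nested in any table.
    if p.find_parent("table") is not None:
        inside.append(p.get_text(strip=True))
    else:
        outside.append(p.get_text(strip=True))

print("inside:", inside)
print("outside:", outside)
```

Because each <p> is visited exactly once, a <p> inside nested tables is reported only once.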
