
Extracting tags as well as text with PHP Simple HTML DOM Parser

Hi, I am using the following to extract the li content. Everything works fine, but I would also like to include the inline HTML tags inside each li (for example <b> and <a>) as well as the text. What do I need to write instead of "$element2->plaintext"?

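    // Load the parser (the path to simple_html_dom.php is assumed)
    include 'simple_html_dom.php';

    // Create DOM from URL or file
    $html = file_get_html('2002-10-01.html');

    // Find all highlighted table rows
    foreach ($html->find('table tr[bgcolor=#FFCCCC]') as $element) {
        // Walk each list item inside the row
        foreach ($element->find('li') as $element2) {

            // Just plain text
            echo $element2->plaintext . '<br>';

            // My attempt at plain text plus HTML elements
            echo $element2->html . '<br>';
        }
    }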

Here is the HTML I am extracting from:

<tr bgcolor="#FFCCCC"> <!-- HEADLINE TEXT --> 
                      <td class="blue_body"> 
                        <ul>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">As 
                            <b>Bertelsmann</b> continues to haggle with Clive 
                            Calder over how much it must pay to buy his Zomba 
                            independent record company, it plans to consolidate 
                            <b>Zomba</b> under its RCA label. <a href="http://www.nypost.com/business/58425.htm">NYPost</a> 
                            </font>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">News 
                            Corporation and Telecom Italia are expected to announce 
                            a deal today to acquire the Italian satellite television 
                            operation of <b>Vivendi Universal</b> for $464m in 
                            cash. <a href="http://www.nytimes.com/2002/10/01/business/media/01RUPE.html">NYTimes</a> 
                            </font>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">Two 
                            years after <b>America Online</b> agreed to acquire 
                            Time Warner, Ted Turner has soured on both the merger 
                            and Stephen Case, its principal architect. <a href="http://nytimes.com/2002/10/01/technology/01AOL.html">NYTimes</a> 
                            </font> 
                        </ul>
                      </td>
                    </tr>

Accepted answer

echo $element2->outertext . '<br>';  
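To see how outertext differs from the other text properties, here is a minimal sketch. It assumes simple_html_dom.php sits next to the script, and the $markup string is a shortened, hypothetical version of one list item from the question rather than the real file.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
include 'simple_html_dom.php';

// Shortened, hypothetical version of one list item from the question.
$markup = '<ul><li>As <b>Bertelsmann</b> continues to haggle. '
        . '<a href="http://www.nypost.com/business/58425.htm">NYPost</a></li></ul>';

$html = str_get_html($markup);
$li   = $html->find('li', 0);   // first <li> in the snippet

$outer = $li->outertext;  // the <li> element itself plus everything inside it
$inner = $li->innertext;  // only what sits between <li> and </li>, tags kept
$plain = $li->plaintext;  // text content with the tags stripped

// htmlspecialchars() keeps the tags visible when the page is viewed in a browser.
echo htmlspecialchars($outer), "\n";
echo htmlspecialchars($inner), "\n";
echo htmlspecialchars($plain), "\n";
```

If the wrapping li tag itself is not wanted, innertext is probably the closer fit, since it keeps the nested b and a tags but drops the li.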

