
Find div with class and its plain text using PHP Simple HTML DOM Parser

I want to find the elements with class ft00 that lie between "Work Experience" and "EDUCATION AND TRAINING", and extract their text (the dates) from the following HTML:

<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>

So far I can extract all of the data between "Work Experience" and "EDUCATION AND TRAINING". That part works properly, and the code is given below:

$fexp = $html->find('p[plaintext^=Work Experience]');
$items = array();
foreach ($fexp as $keye) {
    // Walk forward through the siblings that follow the "Work Experience" heading.
    while ($keye->nextSibling()) {
        $keye = $keye->nextSibling();
        $varce = $keye->plaintext;

        // Stop at the heading that closes the section.
        if (trim($varce) == "EDUCATION AND TRAINING") {
            break;
        }
        $items[] = $varce;
    }
}
var_dump($items);

I am close but can't seem to find the solution. Any help would be appreciated, thanks :-)

With DOMDocument and DOMXPath you could do it like the following. I've never used Simple HTML DOM Parser, but I'm presuming it has XPath support.

<?php
$dom = new DOMDocument();

$dom->loadHTML('
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);

$result = [];
$matching  = false;
foreach ($xpath->query("//p[contains(@class, 'ft00') or contains(@class, 'ft02')]/text()") as $p) {
    if ($p->nodeValue === 'Work Experience' || $matching) {
        $result[] = $p->nodeValue;
        $matching = true;
    }
    if ($p->nodeValue === 'EDUCATION AND TRAINING') {
        break;
    }
}

print_r($result);

Result:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => ABC Company
    [3] => 19 May 2018
    [4] => XYZ Company
    [5] => EDUCATION AND TRAINING
)

https://3v4l.org/0nvr4
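
The result above still contains the ft02 company names and both headings, while the question asks only for the ft00 text between them. A minimal sketch of one way to narrow it down with DOMDocument and DOMXPath follows. The function name textBetweenHeadings and its parameters are hypothetical, and it assumes both headings appear as the full text of a <p> element in document order.

```php
<?php
declare(strict_types=1);

/**
 * Collect the text of <p> elements carrying $class that appear after the
 * $start heading and before the $end heading (both headings excluded).
 * Hypothetical helper, not part of any library.
 */
function textBetweenHeadings(string $html, string $start, string $end, string $class): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $result = [];
    $inside = false;

    // //p returns the paragraphs in document order.
    foreach ($xpath->query('//p') as $p) {
        $text = trim($p->textContent);

        if (!$inside) {
            // Skip everything up to and including the start heading.
            $inside = ($text === $start);
            continue;
        }
        if ($text === $end) {
            break;
        }
        // Match the class as a whole token, not as a substring.
        $classes = preg_split('/\s+/', trim($p->getAttribute('class')), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array($class, $classes, true)) {
            $result[] = $text;
        }
    }

    return $result;
}

$html = <<<'HTML'
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">John@gmail.com</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
HTML;

$dates = textBetweenHeadings($html, 'Work Experience', 'EDUCATION AND TRAINING', 'ft00');
print_r($dates);
```

With the sample HTML from the question this should leave only the two date strings. The state flag is used instead of XPath sibling axes so the loop does not depend on how the parser nests the paragraphs.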

Here is the working code:

$test = array();
$matching = false;
$collection = $html->find('p.ft00');
foreach ($collection as $tkey) {
    // Compare trimmed text; the heading in the sample HTML is "Work Experience".
    $text = trim($tkey->plaintext);
    if ($text == "Work Experience" || $matching) {
        $test[] = $text;
        $matching = true;
    }
    if ($text == "EDUCATION AND TRAINING") {
        break;
    }
}
// print_r produces the Array ( ... ) layout shown in the output.
print_r($test);

Output:

Array
(
    [0] => Work Experience
    [1] => 27 July 2017
    [2] => 19 May 2018
    [3] => EDUCATION AND TRAINING
)
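
The output above still includes the two headings. If only the dates are wanted, one option is to keep just the entries that parse as a date. The sketch below assumes the dates always look like "27 July 2017" (day, full English month name, four-digit year); the helper name isDayMonthYear is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Return true when $text is a date written as "j F Y", e.g. "27 July 2017".
 * Hypothetical helper for illustration.
 */
function isDayMonthYear(string $text): bool
{
    $format = 'j F Y';
    $text = trim($text);
    // The leading "!" resets unparsed fields such as the time of day.
    $date = DateTimeImmutable::createFromFormat('!' . $format, $text);

    // Formatting the date back rejects overflowed values such as "31 February 2017".
    return $date !== false && $date->format($format) === $text;
}

// The array produced by the loop above.
$test = ['Work Experience', '27 July 2017', '19 May 2018', 'EDUCATION AND TRAINING'];

// array_values re-indexes the result from 0.
$dates = array_values(array_filter($test, 'isDayMonthYear'));
print_r($dates);
```

Filtering by format rather than by position means the headings do not need to be removed by hand, at the cost of silently dropping any date written in a different style.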
