
Finding CSS selectors in a webpage using PHP

I need to check whether certain CSS selectors exist in a webpage. For example, if a webpage has a div with an ID like this: <div id='header'> Smile </div>, a PHP function should return true, otherwise false. Likewise, if a webpage has a div with a class like this: <div class='header'> Smile </div>, the PHP function should return true or false accordingly.
I am not sure how to do this, so I tried something like this:

<?php    
include("parser.php"); //using simple html dom parser
$datamain = file_get_html('http://stackoverflow.com/questions/14343073/how-to-count-an-array-content-and-assign-number-position-with-php'); //get the content
$classHeader = $datamain->find('.header', 0); //check for div which has class .header
if(!empty($classHeader)){ //now delete the div which has .header class if it is not empty
    foreach ($datamain->find('.classHeader') as $cclass){
    $datamain = str_replace($cclass,"", $datamain);
    }
}
?>

But it outputs this error:
Fatal error: Call to a member function find() on a non-object in C:\xampp\htdocs\kitten-girl\serp.php on line 4
So, how can I check whether a CSS selector exists and, if it does, do something with the matching element?
Res: http://simplehtmldom.sourceforge.net

For scraping like that on an external page, I use cURL, strpos, and substr. Since you don't need the actual content of the page and are just checking it to see if something is on the page, you just need cURL and strpos. So if you're pulling from that URL, it might look like this:

<?php

function checkPage($url = ''){
    if(!$url){
        return false;
    }
    $soap_do = curl_init();
    curl_setopt($soap_do, CURLOPT_URL, $url);
    curl_setopt($soap_do, CURLOPT_CONNECTTIMEOUT, 15);
    curl_setopt($soap_do, CURLOPT_TIMEOUT, 15);
    curl_setopt($soap_do, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($soap_do);
    curl_close($soap_do);
    if($result === false){
        return false; //the request failed, so there is nothing to search
    }
    //search the raw markup for <div id="header">, <div class="header">, <div id='header'> or <div class='header'>
    $needles = array('<div id="header">', '<div class="header">', "<div id='header'>", "<div class='header'>");
    foreach($needles as $needle){
        //strpos returns 0 for a match at the very start, so compare against false explicitly
        if(strpos($result, $needle) !== false){
            return true;
        }
    }

    return false;

}//end function

$url = "http://stackoverflow.com/questions/14343073/how-to-count-an-array-content-and-assign-number-position-with-php";

if(checkPage($url)){
    //do something on success
}else{
    //do something on failure
}
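
String matching like this only catches the exact spellings listed, so it misses markup with extra attributes or a different attribute order. As a rough alternative, the check can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). The function name hasIdOrClass and the sample markup in the usage note are hypothetical, and the sketch assumes $name is a plain identifier without quotes:

```php
<?php
// Hypothetical helper: returns true if $html contains an element whose id
// equals $name or whose class list includes $name.
// Assumption: $name is a simple identifier with no quote characters.
function hasIdOrClass($html, $name){
    if($html === '' || $html === false){
        return false; // nothing to parse
    }
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Padding the class attribute with spaces matches whole class names only,
    // so class="subheader" does not count as "header".
    $query = sprintf(
        '//*[@id="%1$s" or contains(concat(" ", normalize-space(@class), " "), " %1$s ")]',
        $name
    );
    return $xpath->query($query)->length > 0;
}
```

With markup fetched by the cURL code above, hasIdOrClass($result, 'header') could replace the strpos loop; called on "<div id='header'> Smile </div>" it should report a match regardless of quote style.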

You've got your CSS selector syntax wrong. The correct syntax for finding an element with an id of "header" is "#header". The correct syntax for finding an element with a class of "header" is ".header". For finding a div, and only a div, with a class of "header", it's "div.header" with no space; "div .header" with a space instead matches any element with that class nested inside a div.
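
To make the difference between an id selector, a class selector, and the spaced and unspaced div forms concrete, here is a small sketch that writes each one as an XPath query using PHP's bundled DOM extension. The sample markup and the helper name matchedText are made up for illustration; Simple HTML DOM's find() takes the CSS strings directly, so this only shows what each selector is meant to match:

```php
<?php
// Hypothetical helper: text content of every node an XPath query matches.
function matchedText(DOMXPath $xpath, $query){
    $texts = array();
    foreach($xpath->query($query) as $node){
        $texts[] = $node->textContent;
    }
    return $texts;
}

// Hypothetical sample markup: a div with the id, a div with the class,
// and a span with the class nested inside another div.
$html = '<div id="header">A</div>'
      . '<div class="header">B</div>'
      . '<div><span class="header">C</span></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Whole-word test on the class attribute, reused in the queries below.
$hasClass = 'contains(concat(" ", normalize-space(@class), " "), " header ")';

$byId         = matchedText($xpath, '//*[@id="header"]');             // #header
$byClass      = matchedText($xpath, '//*[' . $hasClass . ']');        // .header
$divWithClass = matchedText($xpath, '//div[' . $hasClass . ']');      // div.header
$insideDiv    = matchedText($xpath, '//div//*[' . $hasClass . ']');   // div .header
```

Here $divWithClass holds only the div's text, while $insideDiv holds only the nested span's text, which is why the space in the selector matters.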
