
How often does loadHTMLFile get called in a single PHP-script?

Since my English is not great, the title may be a bit confusing, so let me be specific:

At the beginning of my PHP script I load an HTML file and extract content from it with XPath.

$url = "http://www....";

$html = new DOMDocument();
libxml_use_internal_errors(true);
$html->loadHTMLFile($url);
$xpath = new DOMXPath($html);
libxml_clear_errors();

The target website has a lot of information I need, but I have to run several different XPath queries (30, to be exact).

$xpath_match = $xpath->query('...');

My first thought was that every XPath query makes loadHTMLFile request the target website again, once per query. That would cause a lot of unnecessary traffic and slow my script down heavily.

I googled a bit and read the documentation, and both seem to suggest that once the target website has been loaded at the beginning of the script, it is kept in memory for as long as the script runs, and every XPath query just reads the stored content.
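That reading can be checked with a minimal sketch. The inline HTML string below is an assumption standing in for the downloaded page, so the example runs without any network access: the document is parsed once, and every query afterwards only walks the in-memory tree.

```php
<?php
// Inline HTML stands in for the downloaded page (an assumption for this sketch).
$source = '<html><body><h1>Title</h1><p class="a">one</p><p class="b">two</p></body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$html = new DOMDocument();
$html->loadHTML($source);           // parsed once into an in-memory tree
libxml_clear_errors();

$xpath = new DOMXPath($html);

// Each query reads the tree held by $html; none of them sends a request.
$title = $xpath->query('//h1')->item(0)->textContent;
$first = $xpath->query('//p[@class="a"]')->item(0)->textContent;
$count = $xpath->query('//p')->length;

echo $title, ' ', $first, ' ', $count, PHP_EOL;   // prints "Title one 2"
```

The same holds when the document comes from loadHTMLFile: the URL is requested once per loadHTMLFile call, so 30 queries on one page load should cost one request, and ten page refreshes cost ten.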

However, yesterday I got an error concerning loadHTMLFile, saying I had called it too often in the last hour. Yet I only refreshed my page with the PHP script about 10 times in that hour. That is not very often, and I am aiming for around 150 to 200 calls to the website per hour in the future.

Can someone clarify the issue? If the content is stored, what was the reason for the error? And is there a workaround?

Load the HTML once with file_get_contents(), then parse the resulting string with loadHTML():

$code = file_get_contents('http://www.example.com/');
$html = new DOMDocument();
libxml_use_internal_errors(true);
$html->loadHTML($code);
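If the remote site limits requests per hour, one possible workaround is to keep the downloaded HTML in a local file and reuse it until it is too old. The following is only a sketch under assumptions: fetchCached(), $cacheFile, $ttl and the injectable $fetch callable are hypothetical names introduced here, and a suitable lifetime depends on how often the target page changes.

```php
<?php
/**
 * Return the page body, downloading it again only when the cached copy
 * is older than $ttl seconds.
 *
 * fetchCached(), $cacheFile, $ttl and $fetch are hypothetical names for
 * this sketch; $fetch defaults to file_get_contents().
 */
function fetchCached(string $url, string $cacheFile, int $ttl, ?callable $fetch = null): string
{
    $fetch = $fetch ?? 'file_get_contents';

    // A fresh cache file means no request is sent at all.
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }

    $code = $fetch($url);
    if ($code === false) {
        throw new RuntimeException("Could not fetch $url");
    }

    file_put_contents($cacheFile, $code);   // remember the body for later runs
    return $code;
}

// Possible usage (the cache path and the 300-second lifetime are assumptions):
// $code = fetchCached('http://www.example.com/', sys_get_temp_dir() . '/page.html', 300);
// $html = new DOMDocument();
// libxml_use_internal_errors(true);
// $html->loadHTML($code);
```

With a lifetime of a few minutes, 150 to 200 script runs per hour would translate into only a handful of real requests to the target site.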


 