Simplehtmldom - limit content size for get_html?

I'm using simplehtmldom to get the titles of some links, and I'm wondering whether I can limit the size of the downloaded content. Instead of downloading the whole page, I'd like to fetch just the first 20 lines or so, which should be enough to get the title.

Right now I'm using this:

  $html = file_get_html($row['current_url']);

  $e = $html->find('title', 0);
  $title = $e->innertext;
  echo $e->innertext . '<br><br>';

Thanks.

Unless I've missed something, that's not the way file_get_html works. It retrieves the entire contents of the page.

In other words, it has to read the entire page before find() can search it for what you are looking for in the next step.

Now, if you were to use:

$section = file_get_contents('http://www.the-URL.com/', NULL, NULL, 0, 444);

You could probably isolate the first 20 lines of HTML, so long as the page you are fetching always has the same layout from <!DOCTYPE html> through </head><body>, or at least through <title></title>. Note that the fifth argument is the maximum number of bytes to read (444 here), not a line count.

Then you could grab the first 20 lines or so, again as long as the size of the <head> stays about the same.

Then use:

$html = str_get_html($section);

And from there use your find():

$html->find('title', 0);
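
Because the cut-off is a byte count rather than a line count, it can help to confirm that the truncated section really contains a complete title before relying on it. The following is only a minimal sketch of that check: `extract_title_from_section` is a hypothetical helper, not part of simple_html_dom, and it uses a plain regular expression instead of the DOM parser.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns the title text
// if the section holds a complete <title>...</title> pair, otherwise null.
function extract_title_from_section(string $section): ?string
{
    // "i" makes the match case-insensitive; "s" lets "." span newlines.
    if (preg_match('~<title[^>]*>(.*?)</title>~is', $section, $m)) {
        return trim($m[1]);
    }
    return null; // title missing, or cut off by the byte limit
}

// Sample markup, assumed for illustration only.
$full = "<!DOCTYPE html>\n<html>\n<head>\n<title>Example Page</title>\n</head>";

var_dump(extract_title_from_section(substr($full, 0, 40))); // cut off mid-title
var_dump(extract_title_from_section($full));                // complete title
```

If the helper returns null, the section was too short and a larger byte limit is needed.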


EDIT:

  include('simple_html_dom.php');

  $the_url = 'http://www.the-URL.com/';

  // Read the first 444 bytes of the page
  $section = file_get_contents($the_url, false, null, 0, 444);
  $html = str_get_html($section);

  if (!$html || !($e = $html->find('title', 0))) {
      // Title not found yet: read the first 888 bytes instead. The fifth
      // argument is a length, not an end offset, and a non-zero offset is
      // not reliable on remote files.
      $section = file_get_contents($the_url, false, null, 0, 888);
      $html = str_get_html($section);
      $e = $html ? $html->find('title', 0) : null;
  }

  if ($e) {
      $title = $e->innertext;
      echo $title . '<br><br>';
  }
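
If the size of the head varies between pages, a fixed byte count may be too small or needlessly large. One possible alternative, sketched here under assumptions, is to read the stream in small chunks and stop as soon as a complete title has arrived. `read_title_from_stream`, its chunk size, and its byte cap are all hypothetical choices, and the demo uses an in-memory stream so that it runs without network access.

```php
<?php
// Hypothetical helper: reads from an already-open stream in small chunks and
// stops once a complete <title> has been seen or $maxBytes has been read.
function read_title_from_stream($stream, int $chunkSize = 512, int $maxBytes = 8192): ?string
{
    $buffer = '';
    while (!feof($stream) && strlen($buffer) < $maxBytes) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or no more data
        }
        $buffer .= $chunk;
        // Re-check the whole buffer so a title split across chunks is found.
        if (preg_match('~<title[^>]*>(.*?)</title>~is', $buffer, $m)) {
            return trim($m[1]);
        }
    }
    return null; // no complete title within the byte cap
}

// Demo on an in-memory stream (sample markup assumed for illustration).
$stream = fopen('php://memory', 'r+');
fwrite($stream, "<!DOCTYPE html><html><head><title>Example Page</title></head><body>"
    . str_repeat('x', 5000));
rewind($stream);
var_dump(read_title_from_stream($stream, 16));
fclose($stream);
```

For a real page the stream would come from something like fopen($the_url, 'r'), which requires allow_url_fopen to be enabled. How many bytes are actually transferred also depends on PHP's internal buffering, so this limits the download only approximately.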


 