How can I download and display the first 200 or 250 characters from a web page?

I have a list of URLs for which I want to display the first 200 or 250 characters. Can I do it using jQuery, or should I download them on the server side (using PHP) and store them in a database?

I guess I will have to use fopen with a limit on the number of characters.

Edit

First 200 characters of the "body", excluding tags. Like a summary.

Reading your title, my first inclination is to use fopen, but a few things came to mind...

1) Are there "new lines" in your target HTML code? For example, if you look at the source code of google.com, the whole "page" is only 15 lines of code. Hence, reading the page line by line would not work.

2) Do you need to take formatting into account? Something as simple as a font tag or a link could take up most (or all) of the 200-character limit.

You may want to look into:

strip_tags(..)

http://php.net/manual/en/function.strip-tags.php
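
To illustrate point 2, here is a small sketch of how markup can eat the character budget. The sample HTML string is made up for the example, and the limit is shortened to 20 so the effect is easy to see.

```php
<?php
// Hypothetical sample markup: the link tag is far longer than the visible text.
$html = '<p><a href="http://example.com/a/very/long/path?with=query" class="headline">Hello</a> world</p>';

// Cutting the raw markup returns nothing but tag syntax.
$raw = substr($html, 0, 20);

// Stripping the tags first leaves only the visible text to cut.
$text = substr(strip_tags($html), 0, 20);

echo $raw, "\n";   // <p><a href="http://e
echo $text, "\n";  // Hello world
```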

How I would do it...

fopen the page and store it in a string, then strip_tags(..) the string and substr(..) the string "buffer".
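
A minimal sketch of those steps, under a few assumptions: fetchPage() and htmlExcerpt() are hypothetical helper names, the 100000-byte read cap is arbitrary, and allow_url_fopen must be enabled for fopen to open a URL. The sketch also removes script and style elements before stripping tags, because strip_tags keeps the text inside them.

```php
<?php
// Hypothetical helper: read at most $maxBytes from $url using fopen().
function fetchPage(string $url, int $maxBytes = 100000): string
{
    $handle = fopen($url, 'r');
    if ($handle === false) {
        return '';
    }
    $html = stream_get_contents($handle, $maxBytes);
    fclose($handle);
    return $html === false ? '' : $html;
}

// Hypothetical helper: first $limit characters of the visible text in $html.
function htmlExcerpt(string $html, int $limit = 200): string
{
    // Keep only the <body> contents when a body element is present.
    if (preg_match('~<body\b[^>]*>(.*)</body>~is', $html, $m)) {
        $html = $m[1];
    }
    // strip_tags() keeps the text inside <script> and <style>, so drop those elements first.
    $html = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', ' ', $html);
    // Remove the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace, including newlines, into single spaces.
    $text = trim(preg_replace('~\s+~', ' ', $text));
    // substr() counts bytes; mb_substr() would be the choice for multibyte text.
    return substr($text, 0, $limit);
}

// Offline demonstration with an inline page.
$sample = '<html><head><title>T</title></head><body><h1>Hi</h1>'
        . '<script>var x = 1;</script><p>Tom &amp; Jerry</p></body></html>';
echo htmlExcerpt($sample), "\n"; // Hi Tom & Jerry
```

For a live page the two helpers would be chained, for example htmlExcerpt(fetchPage($url)), with $url taken from your list.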

Hope this helps.

You could do it with the Simple HTML DOM Parser. This is kind of slow, though, so you might consider storing the page contents in a database if you are displaying many excerpts on one page.

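<?php
// simple_html_dom.php is the third-party Simple HTML DOM Parser library.
include("simple_html_dom.php");

// Download and parse the page; file_get_html() returns false on failure.
$html = file_get_html("http://www.stackoverflow.com");
if ($html) {
    // Collapse runs of whitespace, then keep the first 200 characters.
    echo substr(trim(preg_replace('/\s+/', ' ', $html->plaintext)), 0, 200);
}
?>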
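
The database idea could look roughly like the sketch below. It is an illustration under several assumptions: it caches the finished excerpt rather than the whole page, the excerpts table and the cachedExcerpt() helper are invented for the example, and an in-memory SQLite database (which needs the pdo_sqlite extension) stands in for a real one. A stub takes the place of the slow download-and-parse step so the sketch runs offline.

```php
<?php
// Hypothetical cache table in an in-memory SQLite database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE excerpts (url TEXT PRIMARY KEY, excerpt TEXT NOT NULL)');

// Hypothetical helper: return the stored excerpt, building it only on a cache miss.
function cachedExcerpt(PDO $db, string $url, callable $makeExcerpt): string
{
    $select = $db->prepare('SELECT excerpt FROM excerpts WHERE url = ?');
    $select->execute([$url]);
    $cached = $select->fetchColumn();
    if ($cached !== false) {
        return $cached; // Cache hit: no download needed.
    }
    $excerpt = $makeExcerpt($url); // Cache miss: do the slow work once.
    $insert = $db->prepare('INSERT INTO excerpts (url, excerpt) VALUES (?, ?)');
    $insert->execute([$url, $excerpt]);
    return $excerpt;
}

// Stub standing in for the download-and-parse step; it counts its calls.
$calls = 0;
$slowExcerpt = function (string $url) use (&$calls): string {
    $calls++;
    return 'Excerpt for ' . $url;
};

$first = cachedExcerpt($db, 'http://www.example.com/', $slowExcerpt);
$second = cachedExcerpt($db, 'http://www.example.com/', $slowExcerpt);
echo $first, "\n";  // Excerpt for http://www.example.com/
echo $calls, "\n";  // 1
```

A real setup would also need some way to expire old rows, since the remote pages can change.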
