Display HTML content scraped using PHP file_get_contents as plain text within a div

I have read a number of forum posts about displaying HTML content as plain text in a page, but my situation is a little different, hence this new question.

I have two divs in a page:

1) Input div, where I will let a user insert a URL (say ebay.com, as shown below)

<div id="inputs">
<h3>Inputs</h3>
    <form id="inputs" method="POST">
        <label for="urltoget">URL to Get: </label>
        <input type="text" name="urltoget" id="urltoget" size="50" value="www.ebay.com"><br><br>
        <input type="submit" name="geturl" value="Step1">
    </form>
</div>

2) Output div, where I want to use PHP and file_get_contents to display the contents of the input URL. The catch is that I want to display the output as plain text, not rendered HTML, within the output div.

if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $base_url = $_POST['urltoget'];
    $contents = file_get_contents($base_url);
    print_r($contents);
}
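
One detail worth checking in the snippet above: file_get_contents only fetches over the network when the string starts with a scheme such as http://, so a bare value like www.ebay.com would be looked up as a local file. The sketch below is one possible way to handle that. The helper names normalize_url and fetch_page are hypothetical and do not come from the original post, and the choice to default to http:// is an assumption.

```php
<?php
// Hypothetical helper (not from the original post): prepend a scheme when
// the user typed a bare host name such as "www.ebay.com".
function normalize_url(string $input): string
{
    $url = trim($input);
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url; // assumption: default to plain http
    }
    return $url;
}

// Hypothetical wrapper: returns the page body, or null when the fetch fails.
// file_get_contents() returns false on failure, which is checked explicitly.
function fetch_page(string $input): ?string
{
    $contents = @file_get_contents(normalize_url($input));
    return $contents === false ? null : $contents;
}
```

In the form handler this would be called as fetch_page($_POST['urltoget']), with a null result handled before anything is printed.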

I am getting the entire eBay page rendered as HTML in the output div.

So far I have tried the following:

1) header('content-type: text/plain'); in the PHP code renders the whole page as plain text, as expected. However, I want only the contents of the second (output) div as plain text, not the entire page.

2) print_r(htmlentities($contents)); or echo htmlspecialchars($contents); Inserting either of these in the PHP code does not display any content in the second output div, and it does not throw any error either.

3) var_dump($contents); does not work either. It displays the following:

string

(a huge blank space to scroll through, followed by)

<!DOCTYPE html>
<html>
<head>
<script type="text/javascript">var ue_t0=ue_t0||+ne'... (length=187558)
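
Regarding attempt 2: a possible explanation for the empty output (not confirmed anywhere in this post) is that the fetched page contains bytes that are not valid in the encoding htmlspecialchars assumes. In that case the function returns an empty string without raising an error, unless ENT_SUBSTITUTE or ENT_IGNORE is set. From PHP 8.1 the default flags include ENT_SUBSTITUTE, so this mainly affects older versions or calls that pass flags explicitly. The sample string below is made up for illustration.

```php
<?php
// "\xE9" is "é" in ISO-8859-1 but an invalid byte sequence in UTF-8.
// This sample string is an illustration, not data from the question.
$contents = "<h1>Caf\xE9</h1>";

// Without ENT_SUBSTITUTE, the invalid byte makes the whole result empty.
$strict = htmlspecialchars($contents, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE, the bad byte becomes U+FFFD and the rest survives.
$lenient = htmlspecialchars($contents, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');
echo $lenient, "\n";
```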

My question: how can I get the HTML content (including the HTML tags) as plain text within the second div? Please help!

================================================================

The solution from Terrymorse did the trick:

<?php
$rawHTML = '<html><h1>This is a Title</h1></html>';
$encodedHTML = str_replace('<','&lt;',$rawHTML);
?>

<html>
    <body>
        <h3>
            The Encoded HTML
        </h3>
        <div style="border: 1px solid gray; padding: 12px">
            <pre><?php echo $encodedHTML; ?></pre>
        </div>
    </body>
</html>

Thanks to @markb for the suggestion on var_dump. The output looks a lot cleaner.

You can prevent evaluation of HTML tags simply by converting all instances of < to &lt;. Example:

<?php
$rawHTML = '<html><h1>This is a Title</h1></html>';
$encodedHTML = str_replace('<','&lt;',$rawHTML);
?>

<html>
    <body>
        <h3>
            The Encoded HTML
        </h3>
        <div style="border: 1px solid gray; padding: 12px">
            <pre><?php echo $encodedHTML; ?></pre>
        </div>
    </body>
</html>

Alternatively, there's the <xmp> tag, but it is obsolete. The Mozilla documentation on <xmp> recommends using <pre> and <code> in its place.
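
As a variation on the str_replace approach above, PHP's built-in htmlspecialchars also escapes &, > and quotes. That matters when the scraped page already contains entities: replacing only < leaves a sequence like &lt; untouched, so the browser decodes it back into a tag character instead of showing the source literally. The comparison below is a minimal sketch with a made-up input string, not a change to the answer's method.

```php
<?php
// Made-up sample that already contains entities, as scraped pages often do.
$rawHTML = '<p>Use &lt;b&gt; for bold &amp; <i> for italic</p>';

// Approach from the answer: only "<" is replaced, so the existing "&lt;"
// and "&amp;" pass through and would be decoded by the browser.
$partial = str_replace('<', '&lt;', $rawHTML);

// htmlspecialchars() escapes "&" as well, so existing entities are shown
// exactly as they appear in the page source.
$full = htmlspecialchars($rawHTML, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo "<pre>{$full}</pre>\n";
```

Either result can be echoed inside the <pre> element of the output div; the difference only shows up when the source contains & characters.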
