
Stripping Specific HTML & Content from a page with PHP for RSS

I am building a mobile version of my company website, and one thing we are in need of is an RSS feed.

I have the RSS pulling in fine with this code:

<?php 

    $url = 'http://www.someurl.com/rss/articles';
    $feed = simplexml_load_file($url, 'SimpleXMLIterator');
    $filtered = new LimitIterator($feed->channel->item, 0, 15);
    foreach ($filtered as $item) { ?>

    <li data-icon="false">
    <h2><a href="<?php echo $item->link; ?>"><?php echo $item->title; ?></a></h2>
    <p class="desc"><?php echo $item->description; ?></p>
    <br />
    <p class="category"><b><?php echo $item->category; ?></b></p>
    <a class="link" href="<?php echo $item->link; ?>">Read More</a>
    <br />
    <p class="pubDate"><?php echo $item->pubDate; ?></p>
    <br />
    </li>

 <?php } ?> 

What I would like to do is use either fopen() or file_get_contents() to handle clicks on the 'Read More' link and strip everything from the incoming page except the <article> tag and its contents.

I have searched Google for the past day without finding any tutorials on this subject.

EDIT:

I would like to load the stripped HTML contents into their own view within my framework.

SECOND EDIT:

I would just like to share how I solved this problem.

I changed the 'Read More' link so that $item->link is passed through the URL as a query-string variable:

<a href="article.php?rss_url=<?php echo urlencode((string) $item->link); ?>">Read More</a>

On the article.php page, I collect the variable with an if statement:

if (isset($_GET['rss_url']) && is_string($_GET['rss_url'])) {
    $url = $_GET['rss_url'];
}
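
Because rss_url comes straight from the query string, anyone can point article.php at an arbitrary address. Below is a minimal sketch of a stricter check, assuming the articles always live on the feed's own host. The helper name isAllowedArticleUrl and the $allowedHosts list are hypothetical and not part of my original solution:

```php
<?php
// Hypothetical helper: accept only absolute http(s) URLs on an expected host.
function isAllowedArticleUrl($url, array $allowedHosts)
{
    // Reject non-strings and anything that is not a syntactically valid URL.
    if (!is_string($url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    $host   = strtolower((string) parse_url($url, PHP_URL_HOST));

    // Only plain web URLs on a known host pass; file://, ftp:// etc. are refused.
    return in_array($scheme, ['http', 'https'], true)
        && in_array($host, $allowedHosts, true);
}

// Assumption: the only host we want to fetch from is the feed's own host.
$allowedHosts = ['www.someurl.com'];

$candidate = isset($_GET['rss_url']) ? $_GET['rss_url'] : null;
$url = isAllowedArticleUrl($candidate, $allowedHosts) ? $candidate : null;
```

With this in place, $url is null whenever the parameter is missing or points somewhere unexpected, so the page can show an error instead of calling file_get_contents() on it.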

Then, building on the suggestions below, I fetch the incoming URL and strip out every tag I don't need, leaving markup that I can format for my mobile view:

<div id="article">
  <?php 
    $link = file_get_contents($url);
    $article = strip_tags($link, '<title><div><article><aside><footer><ul><li><img><h1><h2><span><p><a><blockquote><script>');
    echo $article;
  ?>
</div>
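
One caveat with this approach: strip_tags() removes the disallowed tags themselves but keeps the text between them, so text from outside the article (navigation, for instance) can still show up. A small illustration with made-up markup; the $page string is an assumption, not the real site's HTML:

```php
<?php
// Made-up sample page: a <nav> block followed by the <article> we care about.
$page = '<nav>Menu</nav><article><p>Story text</p></article>';

// Allow only <article> and <p>; the <nav> tags are dropped, their text is not.
$stripped = strip_tags($page, '<article><p>');

echo $stripped; // the word "Menu" is still in front of the <article>
```

If that leftover text is a problem, a DOM-based approach that selects only the <article> element avoids it.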

Hopefully this helps anyone else who may encounter this problem :)

I'm not sure I understand correctly, but are you trying to output the contents on the current page whenever someone clicks the 'Read More' link?

I would probably use JavaScript for that, maybe jQuery's .load() function, which loads HTML from another page and lets you load only specific fragments of it. But if you need to use PHP, I would look into Simple HTML DOM Parser:

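// Simple HTML DOM Parser is a third-party library; adjust the path to wherever you saved it
require_once 'simple_html_dom.php';

$html = file_get_html($yourUrl);
$article = $html->find('article', 0);  // Assuming there is only one <article> per page
echo $article;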

The only way I can see is to set up your own separate script to route the links through.

So, instead of echo $item->link, use

echo 'LinkProcessor.php?link=' . urlencode((string) $item->link);

Then set up a script called LinkProcessor.php and, in that script, call file_get_contents on the linked page. You can then process the XML to show only the article tag and echo the result:

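$article = file_get_contents($_GET['link']);
// SimpleXMLElement needs well-formed XML (e.g. XHTML); it throws on sloppy HTML
$xml = new SimpleXMLElement($article);
$articleXml = $xml->xpath('//article');
// asXML() outputs the element's markup; a plain echo would print only its text
echo $articleXml[0]->asXML();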

Note that the code is untested, but it should be OK.
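
If the target page is not well-formed XML, PHP's built-in DOMDocument is more forgiving. Here is a rough sketch along the same lines; the function name extractFirstArticle is just a hypothetical helper, and I am assuming the page contains at most one <article> you care about:

```php
<?php
// Hypothetical helper: return the markup of the first <article>, or null if none.
function extractFirstArticle($html)
{
    if (!is_string($html) || trim($html) === '') {
        return null;
    }

    // Collect parser warnings (HTML5 tags, sloppy markup) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $articles = $doc->getElementsByTagName('article');
    if ($articles->length === 0) {
        return null;
    }

    // Passing a node to saveHTML() serializes only that node and its children.
    return $doc->saveHTML($articles->item(0));
}

// Possible use in LinkProcessor.php (not run here):
// echo extractFirstArticle(file_get_contents($_GET['link']));
```

Unlike strip_tags(), this drops everything outside the <article> element, text included.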
