
Parsing RSS - but following a link in it, to get images from the linked page

I'm very new to RSS parsing and still learning. I'm wondering about the concept of reading an RSS feed in PHP, then following each item's link to the page it points to, and parsing that page to find the images associated with the item (assume the number of images varies, but the HTML structure of the linked page doesn't).

Is such a thing possible? I'm not asking for specifics or for anyone to do it for me, just a brief education on the concept.

Say for example this site: http://www.bulettings.com/

It has an RSS feed: http://www.bulettings.com/propertyrss

Say for each item:

<item>
<title>
House Let @ &#163;2,400 per month, Chatsworth Rd, Charminster, BH8
</title>
<link>
http://www.bulettings.com/property/chatsworth-rd-charminster-bh8/buni-000690/1
</link>
<description>
<a href="http://www.bulettings.com/property/chatsworth-rd-charminster-bh8/buni-000690/1"><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" width="150" alt="" align="left" border="0" /></a>STUDENT HOUSE - Large modernised seven bedroom House in Charminster with partial double glazing. Also has garden off road parking and bike storage.
</description>
<guid isPermaLink="true">
http://www.bulettings.com/property/chatsworth-rd-charminster-bh8/buni-000690/1
</guid>
<pubDate>Mon, 04 Feb 2013 14:02:13 GMT</pubDate>
<enclosure length="29" url="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" type="image/jpg"/>
</item>
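For illustration, an item like the one above could be read with PHP's built-in SimpleXML extension. This is only a minimal sketch under stated assumptions: the helper name `readItem` is hypothetical, and the sample is a trimmed copy of the item above with the description left out.

```php
<?php
// Hypothetical helper: pull the title, page link and enclosure image URL
// out of a single RSS <item> supplied as an XML string.
function readItem(string $itemXml): array
{
    $item = simplexml_load_string($itemXml);
    if ($item === false) {
        throw new InvalidArgumentException('Could not parse item XML');
    }

    return [
        // The feed wraps its values in newlines, so trim them.
        'title'     => trim((string) $item->title),
        'link'      => trim((string) $item->link),
        // The enclosure is optional in RSS, so fall back to null.
        'enclosure' => isset($item->enclosure['url'])
            ? (string) $item->enclosure['url']
            : null,
    ];
}

// Trimmed copy of the item shown above (description omitted).
$sample = <<<'XML'
<item>
<title>
House Let @ &#163;2,400 per month, Chatsworth Rd, Charminster, BH8
</title>
<link>
http://www.bulettings.com/property/chatsworth-rd-charminster-bh8/buni-000690/1
</link>
<enclosure length="29" url="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" type="image/jpg"/>
</item>
XML;

$info = readItem($sample);
echo $info['link'], "\n";
echo $info['enclosure'], "\n";
```

The `link` value is the page to follow next; the `enclosure` only ever gives the single image that the feed itself carries.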

I got that information, and then followed the item's link, which takes me to a page whose source contains:

<script type="text/javascript">
    if(document.images){ 
 currentphoto=1; 
 maxphotos=7; 
    photo = new Array(7); 
    Imagetext = new Array(7); 
photo[1]=new Image();
photo[1].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg";
Imagetext[1] ="Photo 1";
photo[2]=new Image();
photo[2].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5svv.jpg";
Imagetext[2] ="Photo 2";
photo[3]=new Image();
photo[3].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5j86.jpg";
Imagetext[3] ="Photo 3";
photo[4]=new Image();
photo[4].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5vc2.jpg";
Imagetext[4] ="Photo 4";
photo[5]=new Image();
photo[5].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3pd0xwvp9.jpg";
Imagetext[5] ="Photo 5";
photo[6]=new Image();
photo[6].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5gq9.jpg";
Imagetext[6] ="Photo 6";
photo[7]=new Image();
photo[7].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5qhr.jpg";
Imagetext[7] ="Photo 7";
photo[8]=new Image();
photo[8].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3mr0yds1e.jpg";
Imagetext[8] ="Photo 8";

 }
</script>
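One way to read the URLs out of a script like this is a regular expression over the page source. The sketch below assumes the script always keeps the exact `photo[N].src="..."` shape; the function name `photoUrlsFromScript` is hypothetical. Note that the script above declares `maxphotos=7` yet assigns eight photos, so counting the matches is safer than trusting that variable.

```php
<?php
// Hypothetical helper: collect every URL assigned to photo[N].src
// in the inline script of a page.
function photoUrlsFromScript(string $html): array
{
    preg_match_all('/photo\[\d+\]\.src\s*=\s*"([^"]+)"/', $html, $matches);

    // Drop duplicates while keeping the original order.
    return array_values(array_unique($matches[1]));
}

// Shortened copy of the script shown above (first two photos only).
$script = <<<'HTML'
<script type="text/javascript">
photo[1]=new Image();
photo[1].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg";
Imagetext[1] ="Photo 1";
photo[2]=new Image();
photo[2].src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5svv.jpg";
Imagetext[2] ="Photo 2";
</script>
HTML;

$urls = photoUrlsFromScript($script);
print_r($urls);
```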

Or further down the page:

<ul>

<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 4" title="Photo 4"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5svv.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 7" title="Photo 7"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5j86.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 3" title="Photo 3"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5vc2.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 8" title="Photo 8"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3pd0xwvp9.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 2" title="Photo 2"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5gq9.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 2" title="Photo 2"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5qhr.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 6" title="Photo 6"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3mr0yds1e.jpg" width="1024" height="768" onmouseover="document.images['photoview'].src = this.src;" alt="Photo 1" title="Photo 1"/></li>
</ul>
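A list like this can also be read with PHP's DOM extension rather than string matching. The following is a rough sketch, assuming the gallery images always sit in `ul > li > img`; the function name `galleryImageUrls` is hypothetical, and a real page may need a more specific XPath expression.

```php
<?php
// Hypothetical helper: return the src of every <img> found inside a
// <ul><li> list in the given HTML.
function galleryImageUrls(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    // Real-world pages are rarely valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $urls = [];
    foreach ($xpath->query('//ul/li/img[@src]') as $img) {
        $urls[] = $img->getAttribute('src');
    }

    // Drop duplicates while keeping the original order.
    return array_values(array_unique($urls));
}

// Shortened copy of the list shown above (first two entries only).
$list = <<<'HTML'
<ul>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" width="1024" height="768" alt="Photo 4" title="Photo 4"/></li>
<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5svv.jpg" width="1024" height="768" alt="Photo 7" title="Photo 7"/></li>
</ul>
HTML;

$gallery = galleryImageUrls($list);
print_r($gallery);
```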

So the script would take each item's details from the RSS feed, fetch the linked page, strip out everything except the image links, store the result somewhere, and then move on to the next item.
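The whole loop might be sketched roughly as follows. Everything here is an assumption made for illustration: the function name `collectPropertyImages`, the tiny one-item feed with its "Example property" title, and the stub fetcher that stands in for a real HTTP request. The image matching uses a simple regular expression for brevity; a DOM parser would be more robust on real pages.

```php
<?php
// Hypothetical pipeline: for each feed item, fetch the linked page and
// collect the .jpg image URLs found in it. $fetch is any callable that
// takes a URL and returns the page body as a string
// (for example 'file_get_contents').
function collectPropertyImages(string $rssXml, callable $fetch): array
{
    $feed = simplexml_load_string($rssXml);
    if ($feed === false) {
        throw new InvalidArgumentException('Could not parse feed XML');
    }

    $results = [];
    foreach ($feed->channel->item as $item) {
        $link = trim((string) $item->link);
        $html = (string) $fetch($link);

        // Capture the src of every <img> tag that points at a JPEG.
        preg_match_all('/<img[^>]+src="([^"]+\.jpe?g)"/i', $html, $m);

        $results[] = [
            'title'  => trim((string) $item->title),
            'link'   => $link,
            'images' => array_values(array_unique($m[1])),
        ];
    }

    return $results;
}

// Made-up one-item feed; only the link is taken from the real item.
$rss = <<<'XML'
<rss version="2.0"><channel>
<item>
<title>Example property</title>
<link>
http://www.bulettings.com/property/chatsworth-rd-charminster-bh8/buni-000690/1
</link>
</item>
</channel></rss>
XML;

// Stub fetcher so the sketch runs without network access.
$fakeFetch = function (string $url): string {
    return '<ul>'
        . '<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5lor.jpg" width="1024"/></li>'
        . '<li><img src="http://www.estateagentslive.net/pchomesdata/BOURNEMOUTHUNI/PHOTOS/buni-000690-p-w-3nm0z5svv.jpg" width="1024"/></li>'
        . '</ul>';
};

$result = collectPropertyImages($rss, $fakeFetch);
print_r($result);
```

Passing the fetcher in as a callable keeps the parsing logic testable offline; in real use it could be swapped for `'file_get_contents'` or a cURL wrapper.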

Can that be done?

I would use SimplePie to break the feed into accessible pieces. It's extremely easy to use.

get_enclosure is probably what you'll want to focus on.

Here's a "getting started" example to show how easy it is to use.
