I am trying to create a simple RSS feed website.
For a few RSS feeds, I can get everything I need just by doing this:
let article = {
'title': item.title,
'image': item.image.url,
'link': item.link,
'description': item.description,
}
Title and link work for most RSS feeds, but image and description do not,
since a lot of RSS feeds embed the image as HTML inside the description, like this:
{ title: 'The Rio Olympics Are Where TV Finally Sees the Future',
description: '<div class="rss_thumbnail"><img src="http://www.wired.com/wp-content/uploads/2016/08/GettyImages-587338962-660x435.jpg" alt="The Rio Olympics Are Where TV Finally Sees the Future" /></div>Time was, watching the Olympics just meant turning on your TV. That\'s changed—and there\'s no going back. The post <a href="http://www.wired.com/2016/08/rio-olympics-tv-finally-sees-future/">The Rio Olympics Are Where TV Finally Sees the Future</a> appeared first on <a href="http://www.wired.com">WIRED</a>.',...
How can I get the image's URL from it?
EDIT:
http.get("http://www.wired.com/feed/"...
.on('readable', function() {
let stream = this;
let item;
while( item = stream.read()){
let article = {
'title': item.title,
'image': item.image.url,
'link': item.link,
'description': item.description,
}
news.push(article);
}
})
This is part of my code; basically, I am trying to get the image URL from the Wired RSS feed.
If I use 'image': item.image.url, it does not work. What should I change it to?
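Use xml2js to convert the XML to JSON. It expects well-formed XML, so pass it just the <img ... /> tag rather than the whole description: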
var parseString = require('xml2js').parseString;
var xml = '<img title=\'A San Bernardino County Fire Department firefighter watches a helitanker make a water drop on a wildfire, seen from Cajon Boulevard in Devore, Calif., Thursday, Aug. 18, 2016. (David Pardo/The Daily Press via AP)\' height=\'259\' alt=\'APTOPIX California Wildfires\' width=\'460\' src=\'http://i.cbc.ca/1.3730399.1471835992!/cpImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-california-wildfires.jpg\' />';
parseString(xml, function (err, result) {
console.log(JSON.stringify(result, null, 4));
console.log(result["img"]["$"]["src"]);
});
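One idea would be to use a regular expression. For example:
var re = /src=['"]([^'"]+)['"]/; // capture the URL between the quotes
var img_string = "your image tag string";
var match = re.exec(img_string);
var result = match ? match[1] : null; // null when there is no src attribute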
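The regular-expression idea can be applied to the feed item from the question roughly as follows. This is only a sketch: `extractImageUrl` and `toArticle` are hypothetical helper names, the sample item is a shortened copy of the Wired item above, and the empty `image` object is an assumption about what the feed parser returns when the feed has no image element. A regex is fragile against unusual markup, so treat it as a quick fix rather than a full HTML parser.

```javascript
// Hypothetical helper: return the first <img> src found in an HTML string, or null.
function extractImageUrl(html) {
  if (typeof html !== 'string') return null;
  // Match src="..." or src='...' inside the first <img ...> tag.
  const match = /<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1/i.exec(html);
  return match ? match[2] : null;
}

// Hypothetical helper: build the article object from a parsed feed item.
function toArticle(item) {
  // Prefer the feed's own image field, then fall back to the description HTML.
  const feedImage = item.image && item.image.url;
  return {
    title: item.title,
    image: feedImage || extractImageUrl(item.description),
    link: item.link,
    description: item.description,
  };
}

// Shortened copy of the Wired item from the question.
const sample = {
  title: 'The Rio Olympics Are Where TV Finally Sees the Future',
  image: {}, // assumption: the parser leaves this empty for the Wired feed
  description: '<div class="rss_thumbnail"><img src="http://www.wired.com/wp-content/uploads/2016/08/GettyImages-587338962-660x435.jpg" alt="The Rio Olympics Are Where TV Finally Sees the Future" /></div>Time was, watching the Olympics just meant turning on your TV.',
};

console.log(toArticle(sample).image);
```

Inside the `readable` handler, `news.push(toArticle(item))` would then replace the inline object literal.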
If you are working in PHP, you can use the DOMDocument parser to get the image source.
$html = "<img title='A San Bernardino County Fire Department firefighter watches a helitanker make a water drop on a wildfire, seen from Cajon Boulevard in Devore, Calif., Thursday, Aug. 18, 2016. (David Pardo/The Daily Press via AP)' height='259' alt='APTOPIX California Wildfires' width='460' src='http://i.cbc.ca/1.3730399.1471835992!/cpImage/httpImage/image.jpg_gen/derivatives/16x9_460/aptopix-california-wildfires.jpg' />";
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$src = $xpath->evaluate("string(//img/@src)"); // the value of the img tag's src attribute