
How to scrape content from other sites using jQuery?

I am building a news site project in PHP, and for this project I want to fetch content from other news sites using jQuery/JavaScript. Are there any functions in jQuery that can scrape content from other domains?

I also don't want to use much server CPU, since it is a college server. Does using jQuery to scrape content use a lot of CPU?

On Stack Overflow I read about the jQuery.get() function. Is it OK to use this function to scrape content from other sites?

Using the Cross-Domain-Ajax jQuery plugin, you can do it like this:

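// Requires the Cross-Domain-Ajax plugin; plain $.ajax cannot read another domain's HTML.
$.ajax({
    url: 'http://news.bbc.co.uk',
    type: 'GET',
    success: function(res) {
        // res.responseText holds the fetched page's HTML as a string.
        var headline = $(res.responseText).find('a.tsh').text();
        alert(headline);
    }
});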

The plugin hijacks the ajax method so that it uses YQL to grab the HTML and return it as JSON; you then treat that as a string and scrape the data from it. Check out the jQuery Cross-domain Ajax Guide for more info.

You can't. The Same Origin Policy prevents a script on your page from reading a response from another domain. To do this you need to make the request on a server using XMLHTTP.
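To make the Same Origin Policy more concrete, here is a minimal sketch of the comparison a browser applies: two URLs count as the same origin only when scheme, host, and port all match. The function name `sameOrigin` and the `mysite.example` addresses are assumptions made up for illustration; only the BBC URL comes from this thread.

```javascript
// Returns true when two absolute URLs share scheme, host, and port.
// Relies on the standard URL class (available in browsers and Node.js).
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Hypothetical page on your own site requesting another path on the same site.
console.log(sameOrigin('http://mysite.example/news', 'http://mysite.example/feed')); // true

// The same page requesting the BBC front page: a different host, so a different origin.
console.log(sameOrigin('http://mysite.example/news', 'http://news.bbc.co.uk/')); // false
```

When the check comes out false, the browser will not let your script read the response unless the other site explicitly allows it, which is why the request usually has to be made from the server.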

You can use cURL in PHP instead of jQuery for data scraping. There is a blog post on data scraping with cURL in PHP: http://www.codefire.org/blogs/item/data-scraping-using-curl-in-php.html

I suggest you use the cURL module in PHP to access the news site's RSS feed and collect the news you want to embed.

Set up a cron job to periodically download the RSS feed to local storage and convert it into a format you can use for your site. This helps keep the load on the server down, because you're gathering the news once instead of every time the page is accessed.
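As a rough illustration of the "convert it into a format you can use" step, the sketch below turns an RSS 2.0 document that has already been downloaded into a plain list of headlines. It is written in JavaScript only to match the snippet earlier in this thread; the answer itself suggests PHP with cURL, and this is not its code. The function names `tagText` and `rssToHeadlines` are hypothetical, the download step is left out, and the regex-based parsing is a simplification that does not decode XML entities.

```javascript
// Pull the text of the first <tag>...</tag> inside a fragment, unwrapping CDATA.
function tagText(fragment, tag) {
  var pattern = new RegExp('<' + tag + '[^>]*>([\\s\\S]*?)</' + tag + '>', 'i');
  var match = fragment.match(pattern);
  if (!match) return '';
  return match[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, '$1').trim();
}

// Convert an RSS 2.0 document (as a string) into [{ title, link }, ...].
function rssToHeadlines(rssXml) {
  var items = rssXml.match(/<item[\s>][\s\S]*?<\/item>/gi) || [];
  return items.map(function (item) {
    return { title: tagText(item, 'title'), link: tagText(item, 'link') };
  });
}

// Made-up sample feed, standing in for the file the cron job saved.
var sampleFeed =
  '<rss><channel><title>Sample feed</title>' +
  '<item><title><![CDATA[First story]]></title><link>http://example.com/1</link></item>' +
  '<item><title>Second story</title><link>http://example.com/2</link></item>' +
  '</channel></rss>';

console.log(JSON.stringify(rssToHeadlines(sampleFeed)));
```

Saving the converted list next to the downloaded feed means each page view only reads a small local file, which fits the goal of keeping CPU use low on a shared server.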

