
Server-side issues when scraping with the Node JS Cheerio module?

I am trying to follow this thread: "How can one parse HTML server-side with Meteor?"

Unfortunately, I get the following error when doing so:

Uncaught Error: Can't make a blocking HTTP call from the client; callback required. 

Here is the JavaScript code for my project:

var cheerio;

if (Meteor.isClient) {

  Template.entry.events = {
    'click .btn_scrape' : function() {
    $ = cheerio.load(Meteor.http.get("https://github.com/meteor/meteor").content);
    console.log($('.commit-title').text().trim());
    },
 }
}

if (Meteor.isServer) {
  Meteor.startup(function () {
    var require = __meteor_bootstrap__.require;
    cheerio = __meteor_bootstrap__.require('cheerio');
  });


}

If I put the code in Meteor.startup(function() ... nothing happens: there is no error and nothing is logged to the console.

I'd like to be able to call a function when a button is clicked that gets the content of a textbox and scrapes it, but I can do that later once I get the code working.

Would anyone happen to know how to fix this?

Thank you for your time,

Jonathan.

The server and client side are still segregated. In your code, cheerio is only loaded inside the Meteor.isServer block, so it is undefined in the client-side click handler. In that other post, Meteor.call is used to relay the message to the server, which makes the request there and returns the scrape result to the client.

The error you're getting is due to JavaScript being asynchronous on the browser side of things. More info about that here & here. You need to pass a callback in client-side code because it takes time to fetch data from the server, and the browser cannot block while it waits.
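To illustrate why the callback is needed, here is a minimal, framework-free sketch. Note that fetchPage is a hypothetical stand-in for an asynchronous HTTP request (it is not a Meteor API), and setTimeout only simulates network latency. The point is just that nothing useful is returned synchronously, so the result has to be handled inside the callback.

```javascript
// Hypothetical stand-in for an asynchronous HTTP request (NOT a Meteor API).
// It delivers its result later through a Node-style callback(err, result).
function fetchPage(url, callback) {
  setTimeout(function () {
    callback(null, { content: "<html>stub for " + url + "</html>" });
  }, 10);
}

// Wrong: treating the call as if it were blocking.
// The function returns before the "response" exists.
var immediate = fetchPage("https://github.com/meteor/meteor", function () {});
console.log(typeof immediate); // prints: undefined

// Right: use the result inside the callback, once it has arrived.
var received = null;
fetchPage("https://github.com/meteor/meteor", function (err, result) {
  if (err) {
    console.error(err);
    return;
  }
  received = result.content;
  console.log(received);
});
```

The same shape applies to Meteor.call on the client: whatever you want to do with the result goes inside the function you pass as the last argument.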

Is it your intention to run the HTTP request from the client? On the client you run into cross-origin restrictions such as Access-Control-Allow-Origin, which is why that post uses Meteor.call to proxy the request through the server and return the data to the client.

In your click handler you could use the code from "How can one parse HTML server-side with Meteor?" like this:

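Template.entry.events = {
  'click .btn_scrape': function () {
    // Disable the button while the request is in flight.
    $('.btn_scrape').attr('disabled', 'disabled');
    Meteor.call("last_action", function (err, result) {
      // Re-enable the button once the server has responded.
      $('.btn_scrape').removeAttr('disabled');
      if (err) {
        console.log(err);
        return;
      }
      console.log(result);
    });
  }
};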

In the Meteor.isServer section of your code you would still need the method last_action to proxy the data to your browser:

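// Server only: load cheerio through Meteor's bootstrap require.
var cheerio = __meteor_bootstrap__.require('cheerio');

Meteor.methods({
  last_action: function () {
    // No callback is passed, so on the server this call blocks until the response arrives.
    var $ = cheerio.load(Meteor.http.get("https://github.com/meteor/meteor").content);
    return $('.commit-title').text().trim();
  }
});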
