How to make synchronous calls in node.js with request and cheerio

I'm working on a node.js project in which I take input from a text file and try to generate a JSON file containing the output.

My text file contains multiple categories of data; the category currently being processed is held in a global variable, current_title.

My sample text file:

*category1
item1
item2
*category2
item3
item4

My code looks something like this:

const fs = require('fs');
const request = require('request');
const cheerio = require('cheerio');
.
.
var current_title = "";
for (let i = 0; i < lines.length; i++) {
  if(lines[i].startsWith('*')) {
    lines[i] = lines[i].slice(1,undefined);
    current_title = lines[i];
  }
  else {
    console.log(current_title);
    if(current_title.trim() == "category1")
      function1(lines[i]);
  }
}
console.log("the end.");

function function1(name) {
  request(search_url, (err, res, body) => { if (err) console.log(err); else parseBody(body); });
  function parseBody(body) {
    const $ = cheerio.load(body);
    // DOES SOME WEB SCRAPING
    function2(url);
  }
}
  
function function2(url) {
  request(url, (err, res, body) => { if (err) console.log(err); else parseBody(body); });
  function parseBody(body) {
    const $ = cheerio.load(body);
    // DOES SOME WEB SCRAPING
    console.log(current_title);
    // USES current_title TO INSERT DATA INTO A JSON FILE
  }
}

The problem is that the function calls run asynchronously, so my code doesn't work as expected: current_title is updated to 'category2' before function2 tries to insert the data for 'category1'. As a result, my code produces output that looks something like this:

category1
category1
category2
category2
the end.
category2
category2
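
The ordering above can be reproduced without any network access. The following is a minimal sketch, not the code from the question: fakeRequest is a hypothetical stand-in for request that fires its callback from a timer, and the seen array is only there to record what each callback observes.

```javascript
// Hypothetical stand-in for request(): the callback runs on a later tick,
// the same way a real HTTP response would arrive after the loop has ended.
function fakeRequest(url, callback) {
  setTimeout(() => callback(null, {}, `body of ${url}`), 10);
}

let currentTitle = '';
const seen = [];

function scrape(item) {
  fakeRequest(item, (err, res, body) => {
    // By the time this runs, the loop below has finished and
    // currentTitle already holds the last category.
    seen.push(`${item} -> ${currentTitle}`);
  });
}

const lines = ['*category1', 'item1', 'item2', '*category2', 'item3', 'item4'];
for (const line of lines) {
  if (line.startsWith('*')) currentTitle = line.slice(1);
  else if (currentTitle === 'category1') scrape(line);
}
console.log('the end.');

// Print what the callbacks saw once they have all fired.
setTimeout(() => console.log(seen), 50);
```

Both recorded entries pair a category1 item with 'category2', because the callbacks read the global only after the loop has moved on.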

How do I solve this problem? And is there a better way to do this?

Use async/await so that each request completes before the next one starts.

Also, it isn't strictly necessary, but for organization I would move the parseBody function outside of the other function so that it isn't defined twice.

It seems that function1 and function2 do about the same thing with different variables, so I removed function2. Feel free to add it back in if you feel it is necessary.

Here is some optimized code:

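const fs = require('fs');
const request = require('request');
const cheerio = require('cheerio');
...
...
// await is only valid inside an async function, so the loop lives in main()
async function main() {
  var current_title = "";
  for (let i = 0; i < lines.length; i++) {
    if(lines[i].startsWith('*')) {
      lines[i] = lines[i].slice(1,undefined);
      current_title = lines[i];
    }
    else {
      console.log(current_title);
      if(current_title.trim() == "category1")
        // wait for this item to finish before reading the next line
        await function1(lines[i]);
    }
  }
  console.log("the end.");
}
main();

// helper (not in the original code): wraps the callback-based request()
// in a Promise so that it can be awaited
function get(url) {
  return new Promise((resolve, reject) => {
    request(url, (err, res, body) => {
      if (err) reject(err);
      else resolve(body);
    });
  });
}

async function function1(name) {
  try {
    const body = await get(search_url);
    parseBody(body);
  } catch (err) {
    console.log(err);
  }
}

// cheerio.load is synchronous, so parseBody does not need async/await
function parseBody(body) {
  const $ = cheerio.load(body);

  // do whatever
}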
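
On the "is there a better way" part: one option, sketched here under assumptions, is to pass the category to the function as an argument instead of reading the global current_title inside the callback. Each call then keeps its own copy of the title, so the requests no longer have to run one at a time. The names fakeFetch, scrapeItem and run are hypothetical, and fakeFetch only simulates a request with a timer; in real code it would be replaced by a Promise-returning HTTP call followed by the cheerio parsing.

```javascript
// Hypothetical stand-in for an HTTP request: resolves with a fake body
// after a short delay.
function fakeFetch(url) {
  return new Promise((resolve) => setTimeout(() => resolve(`body of ${url}`), 10));
}

// The title is a parameter, so later changes made by the loop
// cannot affect a call that has already started.
async function scrapeItem(item, title) {
  const body = await fakeFetch(item);
  return { title, item, body };
}

async function run(lines) {
  let currentTitle = '';
  const pending = [];
  for (const line of lines) {
    if (line.startsWith('*')) currentTitle = line.slice(1).trim();
    else pending.push(scrapeItem(line, currentTitle));
  }
  // The calls run concurrently; Promise.all returns results in input order.
  return Promise.all(pending);
}

run(['*category1', 'item1', 'item2', '*category2', 'item3', 'item4'])
  .then((results) => {
    for (const r of results) console.log(r.title, r.item);
    console.log('the end.');
  });
```

The trade-off is that all requests are started at once, which may be too aggressive for a large input file; awaiting each call inside the loop keeps them strictly sequential instead.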
