How to handle files received from the frontend synchronously in a Node API

First of all, I apologize for my terrible English :D

Hello, I have the following situation that has me puzzled. I have a frontend made in React and a backend in Node that receives requests through Express. The idea is that the frontend sends a PDF file using a POST request, the backend processes this file (splitting the pages into separate files and extracting data from inside the PDF), and at the end it returns the processed PDFs. I wanted to return these new files in the POST response, but I am having a problem with asynchronous execution. To process the file I use the pdf2json library, which apparently processes the PDF asynchronously and lets the execution flow continue. My problem is that when I hand my PDF to the library, it does the work in the "background" and execution carries on, so the flow reaches its end and sends the POST response before the library has handled the PDFs.

When a POST request arrives, the program executes the function getPdf():

async function getPdf(fileLocation) {
    let pdf = fileLocation;

    await pdfSeparator(pdf, folderTemp);

    await getInformationsPdf();
    return arrayObj
}

When it executes getInformationsPdf(), the program runs everything but does not wait for the library to process the PDF. In this case, I go through each separated file in a forEach and call pdfParser.loadPDF(fileLocation); to load the PDF, expecting it to wait until everything has been read in pdfParser.on("pdfParser_dataReady", pdfData => {}). Because this method is asynchronous, the call just goes into the background, and the flow continues to the end of the block and on to the next forEach item while the PDF hasn't even been processed yet. In the end, the whole forEach has already run and the PDFs have not yet been processed; the program sends the response while the data from the PDFs is still on the backend. Is there a way to force the program to wait for the processing before sending the response?

async function getInformationsPdf() {
    let arrayObjs = []
    fs.readdirSync(folderTemp).forEach(file => {
        var pdfParser = new PDFParser(this, 1);
        let fileLocation = folderTemp + file;
        pdfParser.loadPDF(fileLocation);
        
        pdfParser.on("pdfParser_dataError", (errData) => {
            console.error(errData.parserError)
        });

        pdfParser.on("pdfParser_dataReady", (pdfData) => {
            let t1 = pdfData.formImage.Pages[0].Texts[32].R[0].T.replace(/%20/g, " ");
            let t2 = pdfData.formImage.Pages[0].Texts[33].R[0].T.replace(/%20/g, " ");
            let t3 = pdfData.formImage.Pages[0].Texts[34].R[0].T.replace(/%20/g, " ");
            let t4 = pdfData.formImage.Pages[0].Texts[35].R[0].T.replace(/%20/g, " ");
            let t5 = pdfData.formImage.Pages[0].Texts[36].R[0].T.replace(/%20/g, " ");
            let t6 = pdfData.formImage.Pages[0].Texts[37].R[0].T.replace(/%20/g, " ");
            let t7 = pdfData.formImage.Pages[0].Texts[38].R[0].T.replace(/%20/g, " ");
            let t8 = pdfData.formImage.Pages[0].Texts[39].R[0].T.replace(/%20/g, " ");
            let t9 = pdfData.formImage.Pages[0].Texts[40].R[0].T.replace(/%20/g, " ");
            let textsPdf = [t1, t2, t3, t4, t5, t6, t7, t8, t9];
            let fileWithTexts = {
                file: fileLocation,
                texts: textsPdf
            }

            renameFileMatch(fileWithTexts);
            arrayObjs.push(fileWithTexts);
        });
    })
    return arrayObjs;
}

If I've understood the question correctly, getInformationsPdf() loops over each file in that folder, and it doesn't wait for the processing inside the pdfParser.on("pdfParser_dataReady", ...) handler to finish before going on, so in this bit of code:

    let pdf = fileLocation;

    await pdfSeparator(pdf, folderTemp);

    await getInformationsPdf();
    return arrayObj

it runs return arrayObj before the PDFs have actually finished processing, right?
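
To see the effect in isolation, here is a minimal sketch that runs without pdf2json. FakeParser, its load() method and the "ready" event are hypothetical stand-ins for PDFParser, loadPDF() and "pdfParser_dataReady"; they only mimic the "return now, emit later" behaviour and are not the library's API.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for an event-based parser such as PDFParser:
// load() returns at once and emits "ready" on a later tick of the event loop.
class FakeParser extends EventEmitter {
    load(file) {
        setImmediate(() => this.emit("ready", { file }));
    }
}

function collectWithoutWaiting(files) {
    const results = [];
    files.forEach(file => {
        const parser = new FakeParser();
        parser.on("ready", data => results.push(data.file));
        parser.load(file);
    });
    // This line is reached before any "ready" event has fired.
    return results;
}

const results = collectWithoutWaiting(["a.pdf", "b.pdf"]);
console.log(results.length); // 0: nothing has been parsed yet
setTimeout(() => console.log(results.length), 10); // 2: the events fired afterwards
```

Marking the function async would not change this: await only waits for a promise, and nothing here produces one.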

So the pattern I think you should use is to build an array of Promises with fs.readdirSync(folderTemp).map, and resolve each promise at the end of the pdfParser.on("pdfParser_dataReady", ...) handler. Then you can await Promise.all() on all the promises.

It might look something like this:

async function getInformationsPdf() {
    let arrayObjs = []
    const promises = fs.readdirSync(folderTemp).map(file => {
        return new Promise((resolve, reject) => {
            var pdfParser = new PDFParser(this, 1);
            let fileLocation = folderTemp + file;
            pdfParser.loadPDF(fileLocation);

            pdfParser.on("pdfParser_dataError", (errData) => {
                console.error(errData.parserError);
                reject(errData);
            });

            pdfParser.on("pdfParser_dataReady", (pdfData) => {
                let t1 = pdfData.formImage.Pages[0].Texts[32].R[0].T.replace(/%20/g, " ");
                let t2 = pdfData.formImage.Pages[0].Texts[33].R[0].T.replace(/%20/g, " ");
                let t3 = pdfData.formImage.Pages[0].Texts[34].R[0].T.replace(/%20/g, " ");
                let t4 = pdfData.formImage.Pages[0].Texts[35].R[0].T.replace(/%20/g, " ");
                let t5 = pdfData.formImage.Pages[0].Texts[36].R[0].T.replace(/%20/g, " ");
                let t6 = pdfData.formImage.Pages[0].Texts[37].R[0].T.replace(/%20/g, " ");
                let t7 = pdfData.formImage.Pages[0].Texts[38].R[0].T.replace(/%20/g, " ");
                let t8 = pdfData.formImage.Pages[0].Texts[39].R[0].T.replace(/%20/g, " ");
                let t9 = pdfData.formImage.Pages[0].Texts[40].R[0].T.replace(/%20/g, " ");
                let textsPdf = [t1, t2, t3, t4, t5, t6, t7, t8, t9];
                let fileWithTexts = {
                    file: fileLocation,
                    texts: textsPdf
                }

                renameFileMatch(fileWithTexts);
                arrayObjs.push(fileWithTexts);
                resolve(fileWithTexts);
            });
        })        
    });
    await Promise.all(promises);
    return arrayObjs;
}
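
The same idea can be split into a small helper, which may be easier to read. This is only a sketch under assumptions: FakeParser and its "ready"/"failed" events are hypothetical stand-ins for PDFParser and its "pdfParser_dataReady"/"pdfParser_dataError" events, and parseFile, parseAll and getPdfInfo are illustrative names that do not appear in the question.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for PDFParser so the sketch runs without pdf2json.
class FakeParser extends EventEmitter {
    load(file) {
        setImmediate(() => {
            if (file.endsWith(".pdf")) {
                this.emit("ready", { file, texts: ["t1", "t2"] });
            } else {
                this.emit("failed", new Error("not a pdf: " + file));
            }
        });
    }
}

// Wrap one event-based parse in a Promise.
function parseFile(file) {
    return new Promise((resolve, reject) => {
        const parser = new FakeParser();
        // Register the handlers before starting the work.
        parser.once("failed", reject);
        parser.once("ready", resolve);
        parser.load(file);
    });
}

// Promise.all resolves with the values in input order,
// so no shared array has to be filled from inside the handlers.
async function parseAll(files) {
    return Promise.all(files.map(file => parseFile(file)));
}

// The caller returns the awaited value instead of a separate variable.
async function getPdfInfo(files) {
    const infos = await parseAll(files);
    return infos;
}

getPdfInfo(["a.pdf", "b.pdf"]).then(infos => console.log(infos.map(i => i.file)));
```

Note that getPdf() in the question awaits getInformationsPdf() but then returns arrayObj; unless that variable is defined elsewhere, it probably should return the awaited result, as getPdfInfo does here. Also keep in mind that Promise.all rejects as soon as one file fails, so the route handler should catch that and send an error response.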

You can use a Promise in your getInformationsPdf() function.

Example:

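function getInformationsPdf() {
    return new Promise((resolve, reject) => {
        let arrayObjs = [];
        const files = fs.readdirSync(folderTemp);
        if (files.length === 0) return resolve(arrayObjs); // nothing to parse
        files.forEach(file => {
            ... // your code stuff
            // reject on failure so the caller is not left waiting forever
            if (!fileWithTexts) return reject(new Error("Could not parse " + file));
            renameFileMatch(fileWithTexts);
            arrayObjs.push(fileWithTexts);
            // resolve only after every file has been handled
            if (arrayObjs.length === files.length) resolve(arrayObjs);
        })
    })
}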
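
If the files should be handled one at a time rather than all at once, a for...of loop with await is an alternative to forEach. A minimal sketch, assuming a promise-returning helper; parseOne below is a hypothetical placeholder that simulates the parse with a timer and is not part of pdf2json.

```javascript
// Hypothetical helper: stands in for a promisified parse of a single file.
function parseOne(file) {
    return new Promise(resolve => setTimeout(() => resolve({ file }), 5));
}

async function parseSequentially(files) {
    const parsed = [];
    for (const file of files) {
        // Unlike forEach, for...of pauses here until the promise settles.
        parsed.push(await parseOne(file));
    }
    return parsed;
}

parseSequentially(["a.pdf", "b.pdf"]).then(parsed => console.log(parsed.length));
```

This trades speed for lower memory use and a predictable order of side effects such as renaming files; Promise.all is usually the better fit when the files are independent.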
