
How to download files to the /tmp folder of a Google Cloud Function and then upload them to Google Cloud Storage

So I need to deploy a Google Cloud Function that allows me to do two things.

The first is to DOWNLOAD any file from an SFTP/FTP server into the /tmp local directory of the Cloud Function. Then, the second step is to UPLOAD this file to a bucket in Google Cloud Storage.

Actually I know how to upload, but I don't get how to DOWNLOAD files from the FTP server to my local /tmp directory.

So far I have written a GCF that receives as parameters (in the body) the configuration (config) that allows me to connect to the FTP server, the filename and the path.

For my test I used the following test FTP server: https://www.sftp.net/public-online-sftp-servers with this configuration:

{
    config:
    {
        hostname: 'test.rebex.net',
        username: 'demo',
        port: 22,
        password: 'password'
    },
    filename: 'FtpDownloader.png',
    path: '/pub/example'
}

After my DOWNLOAD, I start my UPLOAD. For that, I check whether the DOWNLOADED file is in '/tmp/filename' before the UPLOAD, but the file is never there.

See the following code:

exports.transferSFTP = (req, res) =>
{
    let body = req.body;
    if(body.config)
    {
        if(body.filename)
        {
            //DOWNLOAD
            const Client = require('ssh2-sftp-client');
            const fs = require('fs');

            const client = new Client();

            let remotePath
            if(body.path)
                remotePath = body.path + "/" + body.filename;
            else
                remotePath = "/" + body.filename;

            let dst = fs.createWriteStream('/tmp/' + body.filename);

            client.connect(body.config)
            .then(() => {
                console.log("Client is connected !");
                return client.get(remotePath, dst);
            })
            .catch(err =>
                {
                    res.status(500);
                    res.send(err.message);
                })
            .finally(() => client.end());


            //UPLOAD
            const {Storage} = require('@google-cloud/storage');

            const storage = new Storage({projectId: 'my-project-id'});

            const bucket = storage.bucket('my-bucket-name');

            const file = bucket.file(body.filename);

            fs.stat('/tmp/' + body.filename,(err, stats) =>
            {
                if(stats.isDirectory())
                {
                    fs.createReadStream('/tmp/' + body.filename)
                        .pipe(file.createWriteStream())
                        .on('error', (err) => console.error(err))
                        .on('finish', () => console.log('The file upload is completed !!!'));

                    console.log("File exist in tmp directory");
                    res.status(200).send('Successfully executed !!!')
                }
                else
                {
                    console.log("File is not on the tmp Google directory");
                    res.status(500).send('File is not loaded in tmp Google directory')
                }
            });
        }
        else res.status(500).send('Error: no filename on the body (filename)');
    }
    else res.status(500).send('Error: no configuration elements on the body (config)');
}

So I receive the following message: "File is not loaded in tmp Google directory", because after the fs.stat() call, stats.isDirectory() is false. Before I added the fs.stat() check, I was just writing files with the right filenames but without any content. So I conclude that my upload works, but without the DOWNLOADED file there is nothing to copy to Google Cloud Storage.

Thanks for your time and I hope I will find a solution.

The problem is that you're not waiting for the download to complete before the code which performs the upload starts running. While you do have a catch() statement, that is not sufficient.

Think of the first part (the download) as a separate block of code. You have told JavaScript to go off and do that block asynchronously. As soon as your script has done that, it immediately goes on to the rest of your script. It does not wait for the 'block' to complete. As a result, your upload code runs before the download has completed.
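
A minimal sketch of that ordering (the names here are hypothetical, not your actual code): the statement after the promise chain runs as soon as the chain has been started, not after it has finished.

function pretendDownload() {
  // Stands in for client.get(): resolves after 100 ms
  return new Promise(resolve => setTimeout(resolve, 100));
}

pretendDownload()
  .then(() => console.log('2) download finished'));

// Runs immediately, long before the then() callback above fires
console.log('1) upload code would start here');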

There are two things you can do. The first would be to move all the code which does the uploading into a 'then' block following the get() call (BTW, you could simplify things by using fastGet()). e.g.

client.connect(body.config)
 .then(() => {
   console.log("Client is connected !");
   return client.fastGet(remotePath, localPath);
 })
 .then(() => {
    // do the upload
  }) 
  .catch(err => {
     res.status(500);
     res.send(err.message);   
  })
 .finally(() => client.end());
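
As a sketch of what could go in the "do the upload" step, here is one way to write it, reusing the Storage objects from your question (the helper name uploadFromTmp is just an illustration, not a library function) and wrapping the stream in a Promise so the chain only continues once the upload has finished:

const fs = require('fs');
const {Storage} = require('@google-cloud/storage');

const storage = new Storage({projectId: 'my-project-id'});
const bucket = storage.bucket('my-bucket-name');

// Hypothetical helper: streams /tmp/<filename> into the bucket and
// resolves only when the upload has finished.
function uploadFromTmp(filename) {
  return new Promise((resolve, reject) => {
    fs.createReadStream('/tmp/' + filename)
      .pipe(bucket.file(filename).createWriteStream())
      .on('error', reject)
      .on('finish', resolve);
  });
}

With that in place, the second then() simply returns uploadFromTmp(body.filename), and a further then() can send the 200 response.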

The other alternative would be to use async/await, which will make your code look a little more 'synchronous'. Something along the lines of (untested):

async function doTransfer(config, remotePath, localPath) {
  try {
    let client = new Client();
    await client.connect(config);
    await client.fastGet(remotePath, localPath);
    await client.end();
    await uploadFile(localPath);
  } catch(err) {
    ....
  }
}
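
uploadFile() is left as a placeholder above; a minimal sketch of it, assuming the same project ID and bucket name as in the question and using the library's bucket.upload() convenience method:

const {Storage} = require('@google-cloud/storage');

// Hypothetical implementation of the uploadFile() placeholder above.
async function uploadFile(localPath) {
  const storage = new Storage({projectId: 'my-project-id'});
  // bucket.upload() reads the local file and uploads it under its base name.
  await storage.bucket('my-bucket-name').upload(localPath);
  console.log(`Upload of ${localPath} completed`);
}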

Here is a GitHub project that answers an issue similar to yours.

Here they deploy a Cloud Function that downloads the files from the FTP server and uploads them directly to the bucket, skipping the step of having a temporary file.
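
The idea boils down to handing the GCS write stream to the SFTP client as the download destination, so the data never touches /tmp. A minimal sketch of that pattern (not the linked project's exact code; the function and parameter names are placeholders):

const Client = require('ssh2-sftp-client');
const {Storage} = require('@google-cloud/storage');

// Streams remotePath from the SFTP server straight into the bucket.
async function ftpToBucket(config, remotePath, bucketName, destName) {
  const sftp = new Client();
  const dst = new Storage().bucket(bucketName).file(destName).createWriteStream();

  await sftp.connect(config);
  try {
    // ssh2-sftp-client's get() accepts a writable stream as its destination.
    await sftp.get(remotePath, dst);
  } finally {
    await sftp.end();
  }
}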

The code works, but the deployment instructions in that GitHub repository are outdated, so I'll list the deploy steps as I suggest them, and I have verified that they work:

  1. Activate Cloud Shell and run:

  2. Clone the repository from GitHub: git clone https://github.com/RealKinetic/ftp-bucket.git

  3. Change to the directory: cd ftp-bucket

  4. Adapt your code as needed

  5. Create a GCS bucket; if you don't have one already you can create one with gsutil mb -p [PROJECT_ID] gs://[BUCKET_NAME]

  6. Deploy: gcloud functions deploy importFTP --stage-bucket [BUCKET_NAME] --trigger-http --runtime nodejs8

In my personal experience this is more efficient than splitting it into two functions, unless you need to do some file editing within the same Cloud Function.
