How can I read and download a file from a ReadableStream on an API call (Node.js)?
How can I download and save a file using the Fetch API? (Node.js)
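I have a URL that points to a potentially large file (100+ MB). How can I save it to a local directory using fetch?
I've looked around, but there don't seem to be many resources or tutorials on how to do this.
Thanks!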
Using the Fetch API, you can write a function that downloads from a URL like this:
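const fs = require('fs');
// Assumes node-fetch v2, whose response body is a Node.js stream that supports .pipe().
const fetch = require('node-fetch');

const downloadFile = async (url, path) => {
  const res = await fetch(url);
  const fileStream = fs.createWriteStream(path);
  await new Promise((resolve, reject) => {
    res.body.pipe(fileStream);
    res.body.on("error", reject);
    fileStream.on("error", reject); // surface write failures as well
    fileStream.on("finish", resolve);
  });
};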
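If you want to avoid explicitly creating a Promise as the other (very good) answer does, and you are fine with buffering the entire 100+ MB file in memory, you can do something simpler: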
const fetch = require('node-fetch');
const {writeFile} = require('fs');
const {promisify} = require('util');
const writeFilePromise = promisify(writeFile);

function downloadFile(url, outputPath) {
  return fetch(url)
    .then(x => x.arrayBuffer())
    .then(x => writeFilePromise(outputPath, Buffer.from(x)));
}
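However, the other answer is more memory-efficient, because it pipes the incoming data stream directly into a file instead of accumulating all of it in a buffer.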
const {createWriteStream} = require('fs');
const {pipeline} = require('stream');
const {promisify} = require('util');
const fetch = require('node-fetch');

const downloadFile = async (url, path) =>
  promisify(pipeline)((await fetch(url)).body, createWriteStream(path));
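If you don't need to handle 301/302 responses (when things have been moved), you can actually do it in one line with the Node.js native libraries http and/or https.
You can run this example one-liner in the node shell. It just uses the https module to download a gzipped tarball of some source code into the directory where you started the node shell. (You start the node shell by typing node at the command line of an OS that has Node.js installed.)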
require('https').get("https://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));
If you don't need or want HTTPS, use this instead:
require('http').get("http://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));
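For the case the one-liners above leave out (301/302 responses), here is a minimal sketch of following redirects with only the built-in modules. The function name `download` and the `redirectsLeft` limit of 5 are assumptions made for illustration; they are not part of the answer above.

```javascript
const http = require('node:http');
const https = require('node:https');
const fs = require('node:fs');

// Hypothetical helper: download `url` to `outputPath`, following up to
// `redirectsLeft` redirects. Resolves with the output path when done.
function download(url, outputPath, redirectsLeft = 5) {
  return new Promise((resolve, reject) => {
    // Pick the client module that matches the URL scheme.
    const client = url.startsWith('https:') ? https : http;
    client.get(url, (res) => {
      const { statusCode, headers } = res;
      if (statusCode >= 300 && statusCode < 400 && headers.location) {
        res.resume(); // discard the redirect body so the socket is released
        if (redirectsLeft === 0) {
          return reject(new Error('Too many redirects'));
        }
        // Location may be relative, so resolve it against the current URL.
        const next = new URL(headers.location, url).toString();
        return resolve(download(next, outputPath, redirectsLeft - 1));
      }
      if (statusCode !== 200) {
        res.resume();
        return reject(new Error(`Unexpected status ${statusCode}`));
      }
      const file = fs.createWriteStream(outputPath);
      res.pipe(file);
      res.on('error', reject);
      file.on('error', reject);
      file.on('finish', () => resolve(outputPath));
    }).on('error', reject);
  });
}

// Example call (not executed here):
// download('https://example.com/archive.tar.gz', 'archive.tar.gz').then(console.log);
```

Unlike the one-liner, this also rejects on non-200 responses, so an error page is not silently saved as the downloaded file.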
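The older answers here involve node-fetch, but since Node.js v18.x this can be done with no extra dependencies.
The body of a fetch response is a web stream. It can be converted to a Node.js stream using Readable.fromWeb, which can then be piped into a write stream created by fs.createWriteStream. If desired, the resulting stream can be turned into a Promise using the promise version of stream.finished.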
const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');

// Top-level await is not available in CommonJS, so wrap the calls in an async function.
(async () => {
  const stream = fs.createWriteStream('output.txt');
  const { body } = await fetch('https://example.com');
  await finished(Readable.fromWeb(body).pipe(stream));
})();
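None of the snippets so far check the HTTP status, so a 404 page would be written to disk as if it were the file. The following is a minimal sketch of the same Node.js v18+ approach with a status check, using `pipeline` from `stream/promises` instead of `.pipe()` so that both streams are destroyed on error. The name `downloadFile` and the error message are assumptions for illustration.

```javascript
const fs = require('node:fs');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: stream `url` to `outputPath` using the built-in fetch.
async function downloadFile(url, outputPath) {
  const response = await fetch(url);
  // response.ok is true only for 2xx statuses; body can be null (e.g. 204).
  if (!response.ok || !response.body) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  // Convert the web stream to a Node.js stream and pipe it to the file.
  await pipeline(
    Readable.fromWeb(response.body),
    fs.createWriteStream(outputPath)
  );
  return outputPath;
}

// Example call (not executed here):
// downloadFile('https://example.com', 'output.txt').then(console.log);
```

Because the data is streamed, memory use should stay roughly constant regardless of file size, which matters for the 100+ MB case in the question.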
import { existsSync } from "fs";
import { mkdir, writeFile } from "fs/promises";
import { sep } from "path";

export const download = async (url: string, ...folders: string[]) => {
  // URLs always use "/" as the separator, whatever the platform's path separator is.
  const fileName = url.split("/").pop();
  const path = ["./downloads", ...folders].join(sep);
  // recursive: true also creates any missing parent folders.
  if (!existsSync(path)) await mkdir(path, { recursive: true });
  const response = await fetch(url);
  const blob = await response.blob();
  const stream = blob.stream();
  const filePath = path + sep + fileName;
  await writeFile(filePath, stream);
  return { path, fileName, filePath };
};

// call it like this ↓
await download("file-url", "subfolder-1", "subfolder-2", ...)
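Taking the last segment of the raw URL string can pick up a query string or fragment as part of the file name. As a sketch of one way around that, the hypothetical helper below (`fileNameFromUrl` and its `fallback` parameter are assumptions, not part of the answer above) parses the URL first and looks only at its path.

```javascript
const path = require('node:path');

// Hypothetical helper: derive a local file name from the last segment of the
// URL path, ignoring any query string or fragment.
function fileNameFromUrl(url, fallback = 'download') {
  const { pathname } = new URL(url);
  // URL paths always use "/", so use the POSIX flavour of basename on every OS.
  const name = path.posix.basename(pathname);
  return name || fallback;
}

console.log(fileNameFromUrl('https://example.com/files/report.pdf?token=abc')); // report.pdf
console.log(fileNameFromUrl('https://example.com/')); // download
```

The fallback covers URLs whose path has no segment at all; whether a fixed default name is appropriate depends on the use case.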
I was looking for a similar use case: I wanted to fetch a bunch of API endpoints and save the JSON responses to some static files, so I came up with my own solution:
const fetch = require('node-fetch'),
  fs = require('fs'),
  VERSIONS_FILE_PATH = './static/data/versions.json',
  endpoints = [
    {
      name: 'example1',
      type: 'exampleType1',
      url: 'https://example.com/api/url/1',
      filePath: './static/data/exampleResult1.json',
      updateFrequency: 7 // days
    },
    {
      name: 'example2',
      type: 'exampleType1',
      url: 'https://example.com/api/url/2',
      filePath: './static/data/exampleResult2.json',
      updateFrequency: 7
    },
    {
      name: 'example3',
      type: 'exampleType2',
      url: 'https://example.com/api/url/3',
      filePath: './static/data/exampleResult3.json',
      updateFrequency: 30
    },
    {
      name: 'example4',
      type: 'exampleType2',
      url: 'https://example.com/api/url/4',
      filePath: './static/data/exampleResult4.json',
      updateFrequency: 30
    },
  ],
  checkOrCreateFolder = () => {
    const dir = './static/data/';
    if (!fs.existsSync(dir)) {
      // recursive: true also creates ./static when it does not exist yet
      fs.mkdirSync(dir, { recursive: true });
    }
  },
  syncStaticData = () => {
    checkOrCreateFolder();
    let fetchList = [],
      versions = [];
    endpoints.forEach(endpoint => {
      if (requiresUpdate(endpoint)) {
        console.log(`Updating ${endpoint.name} data... : `, endpoint.filePath);
        fetchList.push(endpoint);
      } else {
        console.log(`Using cached ${endpoint.name} data... : `, endpoint.filePath);
        let endpointVersion = JSON.parse(fs.readFileSync(endpoint.filePath, 'utf8')).lastUpdate;
        versions.push({
          name: endpoint.name + "Data",
          version: endpointVersion
        });
      }
    });
    // Promise.all([]) resolves immediately, so an empty fetchList needs no special case.
    Promise.all(fetchList.map(endpoint => fetch(endpoint.url, { method: 'GET' })))
      .then(responses => Promise.all(responses.map(response => response.json())))
      .then(results => {
        results.forEach((endpointData, index) => {
          let endpoint = fetchList[index];
          let processedData = processData(endpoint.type, endpointData.data);
          let fileData = {
            data: processedData,
            lastUpdate: Date.now() // Unix timestamp in milliseconds
          };
          versions.push({
            name: endpoint.name + "Data",
            version: fileData.lastUpdate
          });
          fs.writeFileSync(endpoint.filePath, JSON.stringify(fileData));
          console.log('updated data: ', endpoint.filePath);
        });
      })
      .catch(err => console.log(err))
      .finally(() => {
        // Write the versions file only after the fetches settle, so that
        // freshly updated endpoints are included in it.
        fs.writeFileSync(VERSIONS_FILE_PATH, JSON.stringify(versions));
        console.log('updated versions: ', VERSIONS_FILE_PATH);
      });
  },
  recursiveRemoveKey = (object, keyname) => {
    object.forEach((item) => {
      // "items" is the nesting key; recurse if it exists (change as required)
      if (item.items) {
        recursiveRemoveKey(item.items, keyname);
      }
      delete item[keyname];
    });
  },
  processData = (type, data) => {
    // Anything you want to do with the data before it is written to the file.
    // The type must match the "type" values used in the endpoints list above.
    return type === 'exampleType1' ? processType1Data(data) : processType2Data(data);
  },
  processType1Data = data => {
    let fetchedData = [...data];
    recursiveRemoveKey(fetchedData, 'count');
    return fetchedData;
  },
  processType2Data = data => {
    let fetchedData = [...data];
    recursiveRemoveKey(fetchedData, 'keywords');
    return fetchedData;
  },
  requiresUpdate = endpoint => {
    // A file that does not exist yet always needs to be fetched.
    if (!fs.existsSync(endpoint.filePath)) {
      return true;
    }
    let fileData = JSON.parse(fs.readFileSync(endpoint.filePath, 'utf8'));
    let diff = Date.now() - fileData.lastUpdate;
    let diffDays = Math.ceil(diff / (1000 * 60 * 60 * 24));
    return diffDays >= endpoint.updateFrequency;
  };

syncStaticData();
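The staleness check in `requiresUpdate` rounds the file's age up to whole days, which is easy to misread. As a small worked sketch, the hypothetical function `isStale` below isolates that arithmetic (the name, the `now` parameter, and the `MS_PER_DAY` constant are assumptions introduced so the calculation can be run without touching the file system).

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper mirroring the date arithmetic in requiresUpdate:
// the age is rounded UP to whole days before it is compared.
function isStale(lastUpdate, updateFrequencyDays, now = Date.now()) {
  const ageInDays = Math.ceil((now - lastUpdate) / MS_PER_DAY);
  return ageInDays >= updateFrequencyDays;
}

const now = 10 * MS_PER_DAY; // fixed "current time" for a reproducible example

// Exactly 6 days old with a 7-day frequency: still fresh.
console.log(isStale(now - 6 * MS_PER_DAY, 7, now)); // false

// One millisecond older: Math.ceil rounds the age up to 7 days, so it is stale.
console.log(isStale(now - 6 * MS_PER_DAY - 1, 7, now)); // true
```

In other words, with `Math.ceil` a 7-day frequency effectively triggers a refresh once the file is more than 6 days old; using `Math.floor` instead would wait for 7 full days.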