How do I pipe a long string via child_process.spawn() in Node.js?
I'm reading a PDF from an S3 bucket using S3Fs.readFile, and I would like to take the result, convert it to a string, and immediately spawn a child process calling pdftotext, passing it the string:
S3Fs.readFile('./my-pdf-in-s3-bucket', {encoding: 'binary'}, (error, result) => {
  mychild = child_process.spawn('pdftotext', [
    result.Body
  ]);
});
This causes the spawned process to break because the string is too long, and I don't want to save the file to disk just to read it again.
Is it possible?
Thanks!
pdftotext should allow reading from stdin and writing to stdout (at least it worked for me with v0.41.0), so you could do this instead:
S3Fs.readFile('./my-pdf-in-s3-bucket', (err, result) => {
  if (err) throw err; // Handle this better in real code

  // '-' '-' tells pdftotext to read from stdin and write to stdout
  var cp = child_process.spawn('pdftotext', ['-', '-']);
  cp.stdout.pipe(process.stdout);
  cp.on('close', (code, signal) => {
    console.log(`pdftotext finished with status ${code}`);
  });
  cp.stdin.end(result);
});
Or possibly better yet, you might be able to stream the file to the child process instead of buffering its entire contents in memory first:
var cp = child_process.spawn('pdftotext', ['-', '-']);
var rs = S3Fs.createReadStream('./my-pdf-in-s3-bucket');
rs.on('error', (err) => {
  cp.kill();
});
cp.stdout.pipe(process.stdout);
cp.on('close', (code, signal) => {
  console.log(`pdftotext finished with status ${code}`);
});
rs.pipe(cp.stdin);