Can't read .docx file after uploading (Node.js)
So I'm trying to upload a .docx file to an Express server using the express-fileupload package and then read it. The upload part works fine, but I can't read the file: it prints unreadable gibberish. Here is the code:
app.post('/upload', (req, res, next) => {
  let file = req.files.file;
  file.mv(`${__dirname}/public/${req.body.filename}`, function (err) {
    if (err) {
      return res.status(500).send(err);
    }
    fs.readFile(`${__dirname}/public/${req.body.filename}`, 'utf8', function (err, data) {
      if (err) {
        return console.log(err);
      }
      console.log(data); // prints broken text/gibberish
    });
    res.json({ /* data to be returned */ });
  });
});
What I want is to be able to read the .docx file and perform operations on the text inside it.
.docx files don't contain plain, human-readable text. They are actually ZIP archives containing many different XML files, so reading one with fs.readFile and the 'utf8' encoding decodes compressed binary data as if it were text, which is why you see gibberish. Even the text content of the XML files inside the archive isn't easy to work with.
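To see the ZIP container for yourself, you can look at the first four bytes of the uploaded file. The following is a minimal sketch using only Node's built-in fs module; the helper names looksLikeZip and fileLooksLikeZip are made up for illustration, and the check assumes the archive starts with the usual ZIP local file header signature ("PK" followed by 0x03 0x04), which is what a normal .docx file begins with.

```javascript
const fs = require('fs');

// ZIP local file header signature: the bytes "P", "K", 0x03, 0x04.
const ZIP_SIGNATURE = Buffer.from([0x50, 0x4b, 0x03, 0x04]);

// Returns true when the buffer starts with the ZIP signature.
// (Hypothetical helper, not part of any library.)
function looksLikeZip(buffer) {
  return buffer.length >= 4 && buffer.subarray(0, 4).equals(ZIP_SIGNATURE);
}

// Reads only the first four bytes of a file and checks them.
function fileLooksLikeZip(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const header = Buffer.alloc(4);
    const bytesRead = fs.readSync(fd, header, 0, 4, 0);
    return looksLikeZip(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling fileLooksLikeZip on the uploaded file's path should return true for a regular .docx file, which confirms that the 'utf8' output is binary archive data rather than the document's text.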
If you want to read or even modify the text inside a .docx file, you need a library that can read and write the format.
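One library that can do the reading part is mammoth, an npm package. The original answer does not name a library, so treat this choice as an assumption. The sketch below assumes mammoth has been installed with npm install mammoth and uses its extractRawText function; the helper name readDocxText is made up for illustration.

```javascript
// Extracts the plain text of a .docx file using the mammoth package.
// Assumption: mammoth is installed (npm install mammoth).
async function readDocxText(filePath) {
  // Loaded inside the function so a missing package only causes an error
  // when the helper is actually called.
  const mammoth = require('mammoth');
  const result = await mammoth.extractRawText({ path: filePath });
  return result.value; // the document text as a string, formatting discarded
}
```

In the upload handler, this could be called once file.mv has succeeded, for example readDocxText(`${__dirname}/public/${req.body.filename}`).then((text) => console.log(text)). Note that this only covers reading; modifying a .docx file and writing it back would need a different library.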