How to automate Google Drive Docs OCR facility?
I have been using Google Drive and its "Open with Google Docs" facility to convert images into OCR Word files (.docx), because the Word file also preserves the formatting. I have many images, so I upload them to Drive and convert them into editable documents one by one, because PDF conversion does not work.
Each time, I have to wait patiently for one conversion to finish before I can start the next one, which is time consuming.
I also tried the Google OCR API, but it does not preserve formatting such as bold, alignment, etc.
So, is there any way to automate this process using the REST API?
UPDATE
The right-click context menu of an image in Google Drive
Google Docs in the "Open with" context menu
The OCR result after the conversion process (language detected automatically)
I tried googleapis on GitHub and selected the Drive sample list.js code.
My code
'use strict';

const {google} = require('googleapis');
const sampleClient = require('../sampleclient');

const drive = google.drive({
  version: 'v3',
  auth: sampleClient.oAuth2Client,
});

async function runSample(query) {
  const params = {pageSize: 3};
  params.q = query;
  const res = await drive.files.list(params);
  console.log(res.data);
  return res.data;
}

if (module === require.main) {
  const scopes = ['https://www.googleapis.com/auth/drive.metadata.readonly'];
  sampleClient
    .authenticate(scopes)
    .then(runSample)
    .catch(console.error);
}

module.exports = {
  runSample,
  client: sampleClient.oAuth2Client,
};
How about this modification?
From your sample script, it was found that you are using googleapis, so in this modification I also used googleapis. The image files in Drive are converted to Google Documents with OCR by the files.copy method of the Drive API. The following modification supposes that you are using googleapis in Node.js, and that drive in your script can also be used for the files.copy method.
Before you run the script, please confirm the following point: to use the files.copy method, please include https://www.googleapis.com/auth/drive in the scopes in the if statement in list.js.
In this modification, runSample() was modified.
function runSample() {
  // Please set the file IDs of the sample images in Google Drive.
  const files = [
    "### fileId1 ###",
    "### fileId2 ###",
    "### fileId3 ###",
  ];

  // Take each image file and convert it to Google Docs format.
  files.forEach((id) => {
    const params = {
      fileId: id,
      resource: {
        mimeType: 'application/vnd.google-apps.document',
        parents: ['### folderId ###'], // If you want to put the converted files in a specific folder, please use this.
      },
      fields: 'id',
    };

    // Copy the image as a Google Document and log the ID of the new document.
    drive.files.copy(params, (err, res) => {
      if (err) {
        console.error(err);
        return;
      }
      console.log(res.data.id);
    });
  });
}
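The script above starts all copies at once and only logs the results. As a rough sketch of an alternative, the same files.copy call can be awaited one file at a time so that a failure on one image does not hide the others. Here convertImagesToDocs and its options object are hypothetical names made up for this example, drive is assumed to be the Drive v3 client from the script, and ocrLanguage is the optional language hint of files.copy (when it is omitted, the language is left to Drive to detect).

```javascript
// Sketch only: convert images to Google Docs sequentially with async/await.
// `convertImagesToDocs` and its option names are hypothetical, not part of googleapis.
// `drive` is assumed to be the client from google.drive({version: 'v3', auth}).
async function convertImagesToDocs(drive, fileIds, {folderId, ocrLanguage} = {}) {
  const converted = [];
  const failed = [];
  for (const fileId of fileIds) {
    const params = {
      fileId,
      resource: {mimeType: 'application/vnd.google-apps.document'},
      fields: 'id',
    };
    // Optional: put the converted document into a specific folder.
    if (folderId) params.resource.parents = [folderId];
    // Optional: ISO 639-1 language hint for OCR, for example 'en'.
    if (ocrLanguage) params.ocrLanguage = ocrLanguage;
    try {
      // Without a callback, the googleapis client returns a promise.
      const res = await drive.files.copy(params);
      converted.push({source: fileId, doc: res.data.id});
    } catch (err) {
      // Record the failure and continue with the next image.
      failed.push({source: fileId, message: err.message});
    }
  }
  return {converted, failed};
}
```

Inside list.js this could be called as convertImagesToDocs(drive, ['### fileId1 ###'], {folderId: '### folderId ###'}) after authenticating with the https://www.googleapis.com/auth/drive scope.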
To retrieve the file IDs of the image files in a folder, you can use the following script. It filters on the MIME types image/png, image/jpeg and image/tiff.
const folderId = "### folderId ###"; // Please set the ID of the folder that contains the images.
drive.files.list(
  {
    pageSize: 1000,
    q: `'${folderId}' in parents and (mimeType='image/png' or mimeType='image/jpeg' or mimeType='image/tiff')`,
    fields: 'files(id)',
  },
  (err, res) => {
    if (err) {
      console.error(err);
      return;
    }
    const files = res.data.files;
    files.forEach((file) => {
      console.log(file.id);
      // Put the body of the files.forEach callback from the script above here, changing `id` to `file.id`.
    });
  }
);
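The listing above reads a single page of up to 1000 files. If the folder may hold more images than that, a sketch along the following lines could follow the nextPageToken returned by files.list until no further page is reported. listImageIds is a hypothetical helper name, drive is assumed to be the same Drive v3 client, and the trashed = false term is an addition of this sketch that is not in the original query.

```javascript
// Sketch only: collect the IDs of all images in a folder across result pages.
// `listImageIds` is a hypothetical helper; `drive` is the Drive v3 client.
async function listImageIds(drive, folderId) {
  // Same filter as above, plus `trashed = false` (an assumption of this sketch).
  const q =
    `'${folderId}' in parents and trashed = false and ` +
    `(mimeType='image/png' or mimeType='image/jpeg' or mimeType='image/tiff')`;
  const ids = [];
  let pageToken;
  do {
    const params = {
      pageSize: 1000,
      q,
      // nextPageToken must be requested explicitly when `fields` is set.
      fields: 'nextPageToken, files(id)',
    };
    if (pageToken) params.pageToken = pageToken;
    const res = await drive.files.list(params);
    for (const file of res.data.files || []) {
      ids.push(file.id);
    }
    // Undefined on the last page, which ends the loop.
    pageToken = res.data.nextPageToken;
  } while (pageToken);
  return ids;
}
```

The returned array can then be passed to whichever conversion routine you use in place of the hard-coded file IDs.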
In this next modification, the entire runSample() was modified.
function runSample() {
  // Put the ID of the folder that contains the files you want to convert.
  const folderId = "### folderId ###";

  // Retrieve the file list.
  drive.files.list(
    {
      pageSize: 1000,
      q: `'${folderId}' in parents and (mimeType='image/png' or mimeType='image/jpeg' or mimeType='image/tiff')`,
      fields: 'files(id)',
    },
    (err, res) => {
      if (err) {
        console.error(err);
        return;
      }
      const files = res.data.files;

      // Process each file from the retrieved file list.
      files.forEach((file) => {
        const params = {
          fileId: file.id,
          resource: {
            mimeType: 'application/vnd.google-apps.document',
            parents: ['### folderId ###'],
          },
          fields: 'id',
        };

        // Convert a file
        drive.files.copy(params, (err, res) => {
          if (err) {
            console.error(err);
            return;
          }
          console.log(res.data.id);
        });
      });
    }
  );
}
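The scripts above stop at the Google Document. Since the question asks for Word files, one possible follow-up step is to export each converted document as .docx with the files.export method. The following is only a sketch: exportDocAsDocx and destPath are hypothetical names, drive is assumed to be the same Drive v3 client, and whether the exported file keeps the formatting you need has not been verified here.

```javascript
const fs = require('fs');

// MIME type of Word (.docx) files, used as the export target.
const DOCX_MIME =
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

// Sketch only: download a converted Google Document as a .docx file.
// `exportDocAsDocx` is a hypothetical helper; `drive` is the Drive v3 client.
async function exportDocAsDocx(drive, docId, destPath) {
  const res = await drive.files.export(
    {fileId: docId, mimeType: DOCX_MIME},
    // Ask the HTTP layer for raw bytes instead of parsed text.
    {responseType: 'arraybuffer'}
  );
  await fs.promises.writeFile(destPath, Buffer.from(res.data));
  return destPath;
}
```

For example, the ID logged by drive.files.copy could be passed in as exportDocAsDocx(drive, res.data.id, 'page1.docx'), where the output file name is just a placeholder.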