
Failing to import a large file into a MarkLogic database with MLCP

I have a large PDF file (about 1 GB) that fails to load into MarkLogic. Is there a way for mlcp to split the large file into smaller files and then merge them back into a single PDF after loading into the database?

skipp record () in file:/data2022/ABO2022-129.pdf, reason: the file size too large: 13040600 use streaming option.

MarkLogic does not really care about the size of the binary. At that size, by default it will simply be stored as a regular file in the large-binary directory under the forest and treated by the system like any other binary. No, there are no tools to break the content apart, and there is unlikely to be any valid reason to do this for binaries, unless you hit some documented maximum size for MarkLogic, your filesystem, or another system resource.

The error you see is not a MarkLogic error. It is a maximum size imposed by MLCP, related to memory management. Assuming that you are reading this file from disk and not running a server-to-server MLCP command, the error message already suggests your next fix: the streaming option of MLCP. It streams the file from disk to the server and does not need to build the whole document node in local memory.
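As a rough sketch of what a streaming import could look like, the Python helper below only assembles an mlcp command line and prints it. The host, port, credentials, and the `mlcp.sh` launcher name are placeholders, not values from the question. Pairing `-streaming true` with `-document_type binary` reflects my understanding of the streaming section of the user guide, so verify it against your MLCP version before relying on it.

```python
import shlex
from typing import List


def build_mlcp_streaming_import(
    input_path: str,
    host: str = "localhost",      # assumption: placeholder host
    port: int = 8000,             # assumption: placeholder app server port
    username: str = "user",       # assumption: placeholder credentials
    password: str = "password",
    mlcp_bin: str = "mlcp.sh",    # assumption: launcher is on PATH
) -> List[str]:
    """Build the argument list for an mlcp import that streams binaries."""
    return [
        mlcp_bin, "import",
        "-host", host,
        "-port", str(port),
        "-username", username,
        "-password", password,
        "-input_file_path", input_path,
        # Streaming is meant for binary documents read from disk.
        "-document_type", "binary",
        "-streaming", "true",
    ]


if __name__ == "__main__":
    args = build_mlcp_streaming_import("/data2022/ABO2022-129.pdf")
    # Print the command for review; pass `args` to subprocess.run to execute it.
    print(shlex.join(args))
```

Building the arguments as a list rather than a single string avoids shell-quoting problems if the file path contains spaces.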

For details, see section 4.13.5 of the MLCP User Guide: Reducing Memory Consumption With Streaming.
