
Replace strings in large file

I have a server-client application where clients can edit data in a file stored on the server. The problem is that the file is too large (8 GB+) to load into memory. The connected clients could invoke around 50 string replacements per second, so copying the whole file and replacing the specified string with the new one each time is out of the question.

I was thinking about saving all changes in a cache on the server side and performing all the replacements once a certain amount of data has accumulated. At that point I would perform the update by copying the file in small chunks and replacing the specified parts.
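A minimal sketch of that batched, chunk-wise copy, assuming the cached changes can be expressed as a mapping from old byte strings to new ones. The names `copy_with_replacements` and `chunk_size` are made up for illustration and are not part of the original application. The one subtle point is a match that straddles a chunk boundary, so the sketch holds back the last few bytes of each buffer until the next chunk arrives.

```python
import re


def copy_with_replacements(src_path, dst_path, replacements, chunk_size=1 << 20):
    """Stream src_path to dst_path, applying all replacements in one pass.

    replacements: dict mapping old bytes -> new bytes (hypothetical format
    for the cached edits). Replaced text is never scanned again.
    """
    if not replacements or any(len(old) == 0 for old in replacements):
        raise ValueError("need at least one replacement and no empty keys")

    # Longest keys first, so a longer key wins over a shorter prefix of it.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(b"|".join(re.escape(k) for k in keys))
    # A match that starts in the last `keep` bytes might be cut off.
    keep = len(keys[0]) - 1

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        buf = b""
        while True:
            chunk = src.read(chunk_size)
            eof = not chunk
            buf += chunk
            # Matches starting before `limit` are guaranteed to be complete.
            limit = len(buf) if eof else max(len(buf) - keep, 0)
            out = []
            pos = 0
            while True:
                m = pattern.search(buf, pos)
                if m is None or m.start() >= limit:
                    break
                out.append(buf[pos:m.start()])
                out.append(replacements[m.group()])
                pos = m.end()
            # Emit everything up to the safe point, keep the rest for later.
            cut = max(pos, limit)
            out.append(buf[pos:cut])
            dst.write(b"".join(out))
            buf = buf[cut:]
            if eof:
                break
```

Memory use stays proportional to `chunk_size` rather than to the file size. After a successful copy, the new file could be swapped in for the old one (for example with `os.replace`), but every batch still rewrites all 8 GB, and readers see stale data until the swap happens.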

This is the only idea I came up with, but I was wondering whether there might be another way, or what problems I could encounter with this method.

When you have more than 8 GB of data that is edited by many users simultaneously, you are far beyond what can be handled with a flat file.

You seriously need to move this data to a database. Regarding your comment that "the file content is no fit for a database": sorry, but I don't believe you. Especially regarding your remark that "many people can edit it": that's one more reason to use a database. On a filesystem, only one user at a time can safely have write access to a file. But a database allows concurrent write access for multiple users.

We could help you come up with a database schema if you open a new question telling us exactly how your data is structured and what your use cases are.
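Without knowing the real data layout, any schema is a guess, but as a rough illustration of what a "string replacement" turns into once the data lives in a database, here is a sketch using Python's built-in `sqlite3` module. The table `records`, its columns, and the helper `replace_record` are hypothetical. SQLite is used only because it needs no server; it serializes writers, whereas a server database such as PostgreSQL or MySQL would be the more natural fit for many clients writing at once.

```python
import sqlite3

# In-memory database for demonstration; a real setup would use a file
# or a database server instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO records (id, content) VALUES (?, ?)",
    [(1, "alpha"), (2, "beta"), (3, "gamma")],
)
conn.commit()


def replace_record(conn, record_id, new_content):
    """Overwrite one record; returns the number of rows changed (0 or 1)."""
    # `with conn` commits on success and rolls back if an exception occurs.
    with conn:
        cur = conn.execute(
            "UPDATE records SET content = ? WHERE id = ?",
            (new_content, record_id),
        )
    return cur.rowcount


replace_record(conn, 2, "beta (edited)")
```

Each edit touches only the affected row, so 50 replacements per second no longer imply rewriting the whole data set.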

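You could use some form of index on the data (in a separate file) to allow fast access to the relevant parts of this huge file. We have done this successfully with large files (~200-400 GB). But as Phillipp said, you should move this data to a database, especially for read/write access. Some frameworks (e.g. OSG) already come with database backends for 3D terrain data, so you could look there to see how they do it.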
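To make the index idea concrete, here is a minimal sketch that assumes the big file consists of newline-terminated records. That is an assumption, since the question does not describe the format, and `build_index` and `read_record` are illustrative names. The index file stores one fixed-size byte offset per record, so looking up record `n` takes two seeks instead of a scan.

```python
import struct

# Each index entry is one little-endian unsigned 64-bit byte offset.
ENTRY = struct.Struct("<Q")


def build_index(data_path, index_path):
    """Scan the data file once and write the start offset of every line."""
    count = 0
    offset = 0
    with open(data_path, "rb") as data, open(index_path, "wb") as index:
        for line in data:
            index.write(ENTRY.pack(offset))
            offset += len(line)
            count += 1
    return count


def read_record(data_path, index_path, n):
    """Return record n (0-based) without reading the rest of the file."""
    if n < 0:
        raise IndexError("record number must be non-negative")
    with open(index_path, "rb") as index:
        index.seek(n * ENTRY.size)
        raw = index.read(ENTRY.size)
    if len(raw) != ENTRY.size:
        raise IndexError("record number out of range")
    (offset,) = ENTRY.unpack(raw)
    with open(data_path, "rb") as data:
        data.seek(offset)
        return data.readline().rstrip(b"\r\n")
```

This only speeds up locating a record. A replacement that changes a record's length still shifts everything after it and invalidates the later offsets, which is one more reason the database route is easier for read/write access.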
