Efficiency of gettext: in-memory translation

Question

I have an embedded system with flash, a very low-end CPU, and little RAM. I want to know how efficient gettext language translation using a .mo file is.

When fetching a locale string, does gettext read the .mo file from flash every time, or is the complete .mo binary first loaded into RAM, with the string then fetched from there?

If the .mo file (it will be large, about 1 MB, since there are a lot of strings) is always loaded into RAM, it will eat up my RAM.

Answer 1

As MSalters said, it is open source, so you could tweak it.

If you give a fuller definition of the system (as per my comment), we might be able to help more.

If this is a deeply embedded system (the sort of stuff I do), with no OS and no external file system of any type, the strings must all be in memory. There will very likely be a mechanism to store those strings in flash, so that they consume no RAM.

For example, on an ARM, data structures can easily be stored in flash. To do that, you need to tell the compiler which section of the program to store them in, using something like:

const char mesg1[] __attribute__((section (".USER_FLASH"))) 
             = "Ciao a tutti";
const char mesg2[] __attribute__((section (".USER_FLASH"))) 
             = "Riesco a sentire la mia mente va Dave";

When the program is linked, the linker script needs to be written to place the strings into flash, and they will not be copied to RAM.
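As a rough sketch of how such flash-resident strings might be used, the messages can be collected in a const table indexed by message ID, so that neither the strings nor the table need RAM. The names FLASH_DATA, msg_id, messages_it and msg_get are made up for this illustration; on a real target FLASH_DATA would be defined as the section attribute shown above, and here it expands to nothing so the code also builds on a host.

```c
#include <stddef.h>

/* Assumption: on the target, build with something like
 *   -DFLASH_DATA='__attribute__((section(".USER_FLASH")))'
 * On a host build it expands to nothing. */
#ifndef FLASH_DATA
#define FLASH_DATA
#endif

/* Hypothetical message IDs for this sketch. */
enum msg_id { MSG_HELLO, MSG_BYE, MSG_COUNT };

static const char msg_hello[] FLASH_DATA = "Ciao a tutti";
static const char msg_bye[]   FLASH_DATA = "Arrivederci";

/* The pointer table is const as well, so it can live in flash too. */
const char *const messages_it[MSG_COUNT] FLASH_DATA = {
    [MSG_HELLO] = msg_hello,
    [MSG_BYE]   = msg_bye,
};

/* Return the message for an ID, or an empty string if out of range. */
const char *msg_get(enum msg_id id)
{
    return ((unsigned)id < (unsigned)MSG_COUNT) ? messages_it[id] : "";
}
```

Calling msg_get(MSG_HELLO) returns a pointer straight into the table, with no copy made.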

Approximately how much space can you dedicate to messages? How much space do they take?

You may be fighting a well-researched problem: the amount of programming effort increases exponentially as resource limits are approached. It may take tremendous effort to fit things into the final few percent of memory.

One 'obvious' tweak is to try a few simple compression techniques. One might be applied to the raw messages, which are then decompressed as they are printed.
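To make the compression idea concrete, here is a minimal sketch of one simple technique, dictionary substitution: frequent fragments are replaced by one-byte tokens and expanded while the message is printed. This is only an illustration, not something gettext does. The dictionary contents, the token range 0x01 to 0x08, and the names msg_print, msg_buf and buf_emit are all assumptions for the example. All other bytes pass through unchanged, so UTF-8 text survives.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dictionary of frequent fragments (would live in flash). */
static const char *const dict[] = { "the ", "file ", "error: ", "not found" };
#define DICT_LEN (sizeof dict / sizeof dict[0])

/* Output callback, e.g. a UART putc on the target. */
typedef void (*emit_fn)(char c, void *ctx);

/* Expand a packed, NUL-terminated message one character at a time.
 * Bytes 0x01..0x08 are dictionary tokens; everything else is literal. */
void msg_print(const uint8_t *packed, emit_fn emit, void *ctx)
{
    for (; *packed != 0; packed++) {
        if (*packed > 0x08) {
            emit((char)*packed, ctx);
            continue;
        }
        size_t idx = (size_t)*packed - 1u;
        if (idx >= DICT_LEN)
            continue; /* unknown token: skip it */
        for (const char *w = dict[idx]; *w != '\0'; w++)
            emit(*w, ctx);
    }
}

/* Small buffer sink, used here only to demonstrate the expansion. */
typedef struct { char *buf; size_t len, cap; } msg_buf;

void buf_emit(char c, void *ctx)
{
    msg_buf *b = ctx;
    if (b->len + 1 < b->cap) {
        b->buf[b->len++] = c;
        b->buf[b->len] = '\0';
    }
}
```

With this dictionary, the three bytes 0x03, 0x02, 0x04 expand to "error: file not found". How much this saves depends entirely on how repetitive the real messages are.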

Edit: I thought your question seemed so straightforward and natural that I had assumed the answer would be straightforward to find.

I had a look at the gettext documentation, but failed to find it there. I downloaded the source. After 10 minutes, I honestly could not tell you how it worked. I can tell you it is much more complicated than I'd expected. I looked at the extensive documentation. There is lots of documentation on how best to organise translation work, on how to prepare the program, and on things that can cause problems. Very helpful insights. Yet I could not find any documentation explaining its overall run-time architecture. None. Nothing.

My best advice is to go to the GNU gettext mailing lists, search, and if necessary ask. The mailing list archives can be found at http://savannah.gnu.org/projects/gettext/. I apologise that I couldn't be more helpful.

Answer 2

gettext is typically used with a hash table:

When the user selects a language, the content of a .mo file is processed to find the offsets of every translation. Those offsets are stored in a hash table.
When a translated string is to be displayed, the hash of the corresponding English string is calculated, and the offset of the translated string is found using that hash.

If the flash memory in your embedded system is mapped into the address space, the English strings and the translations can be stored in the flash. Only the hash table will need to be in RAM. You'll need to reserve the size of one hash and one pointer per translated string. If you use CRC32 as the hash and 4-byte pointers, that is 8 bytes per entry, so you'll need 8 kB of RAM for 1024 translated strings.
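The scheme described above could look roughly like the following sketch. It is not how GNU gettext itself is implemented. It assumes a little-endian .mo image that is directly addressable (for example in memory-mapped flash), uses the documented .mo header layout (magic 0x950412de, string count at byte 8, original-string table offset at byte 12, translation table offset at byte 16), and keeps a flat index that is scanned linearly instead of a bucketed hash table. The names mo_index, mo_entry, mo_index_build, mo_lookup and MAX_STRINGS are invented for this example, and no bounds checking of offsets is done.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MO_MAGIC    0x950412deu
#define MAX_STRINGS 1024u /* assumption: upper bound chosen for this sketch */

/* One RAM entry: 4-byte hash + 4-byte offset = 8 bytes per string. */
typedef struct {
    uint32_t hash;      /* CRC32 of the original (English) string */
    uint32_t tr_offset; /* offset of the translation inside the image */
} mo_entry;

typedef struct {
    const uint8_t *image;      /* .mo image, stays in flash */
    uint32_t       count;
    uint32_t       orig_table; /* offset of the original-string table */
    mo_entry       entries[MAX_STRINGS];
} mo_index;

/* Read a little-endian 32-bit value without alignment assumptions. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320) of a C string. */
static uint32_t crc32_str(const char *s)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (*s != '\0') {
        crc ^= (uint8_t)*s++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Build the RAM index once, when a language is selected. */
int mo_index_build(mo_index *ix, const uint8_t *image)
{
    if (rd32(image) != MO_MAGIC)
        return -1;
    uint32_t n    = rd32(image + 8);
    uint32_t orig = rd32(image + 12);
    uint32_t tran = rd32(image + 16);
    if (n > MAX_STRINGS)
        return -1;
    ix->image = image;
    ix->count = n;
    ix->orig_table = orig;
    for (uint32_t i = 0; i < n; i++) {
        /* Each table entry is (length, offset); we need the offset. */
        const char *msgid =
            (const char *)image + rd32(image + orig + 8u * i + 4u);
        ix->entries[i].hash = crc32_str(msgid);
        ix->entries[i].tr_offset = rd32(image + tran + 8u * i + 4u);
    }
    return 0;
}

/* Return the translation, or msgid itself if there is none. */
const char *mo_lookup(const mo_index *ix, const char *msgid)
{
    uint32_t h = crc32_str(msgid);
    for (uint32_t i = 0; i < ix->count; i++) {
        if (ix->entries[i].hash != h)
            continue;
        /* Confirm against the original in flash to rule out collisions. */
        const char *orig = (const char *)ix->image +
            rd32(ix->image + ix->orig_table + 8u * i + 4u);
        if (strcmp(orig, msgid) == 0)
            return (const char *)ix->image + ix->entries[i].tr_offset;
    }
    return msgid;
}
```

With MAX_STRINGS set to 1024, the entries array is 1024 x 8 = 8192 bytes, which matches the 8 kB figure above. Since the original strings in a .mo file are sorted, a binary search directly on the flash image would avoid the RAM index altogether, at the cost of more string comparisons per lookup.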

If you don't have flash memory mapped into the address space, you'll have to either load the complete .mo file into RAM when a language is selected, or call a flash I/O routine every time you want to display a string.
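The second option, calling a flash I/O routine per string, might be sketched as follows. Only one small, fixed-size buffer has to live in RAM. The function flash_read here is a hypothetical stand-in backed by an array so the example runs on a host; on a real system it would be whatever read routine the flash driver provides. fetch_string and fake_flash are likewise invented names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Host stand-in for flash that is NOT memory mapped (hypothetical). */
static const uint8_t fake_flash[] = "Ciao\0Mondo";

/* Hypothetical driver call: copy len bytes from flash offset addr. */
int flash_read(uint32_t addr, void *dst, size_t len)
{
    if (addr > sizeof fake_flash || len > sizeof fake_flash - addr)
        return -1;
    memcpy(dst, fake_flash + addr, len);
    return 0;
}

/* Fetch one string of known length into a caller-supplied RAM buffer,
 * truncating to cap - 1 characters. Returns buf. */
const char *fetch_string(uint32_t addr, uint32_t len, char *buf, size_t cap)
{
    if (cap == 0)
        return "";
    size_t n = (len < cap - 1) ? len : cap - 1;
    if (flash_read(addr, buf, n) != 0)
        n = 0; /* read failed: return an empty string */
    buf[n] = '\0';
    return buf;
}
```

The address and length would come from the .mo tables (or from an index built when the language is selected). The buffer size caps the longest displayable string, so it trades RAM against truncation.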

Answer 1: score 1, accepted, 2012-03-22 13:25:18

Answer 2: score 0, 2022-01-30 13:25:40
