Python xgettext merges gettext and ngettext strings, breaking translation lookup

Running:

> xgettext test.py -o out.pot

On the Python snippet test.py:

from gettext import gettext, ngettext
def main(num):
    gettext("TEST")
    ngettext("TEST", "TESTS", num)

Produces a pot file with the following entry (the translated strings shown here were filled in later, in the po file):

#: test.py:3 test.py:4
msgid "TEST"
msgid_plural "TESTS"
msgstr[0] "TEST-SINGLE"
msgstr[1] "TEST-PLURAL"

After turning this into a po file and then compiling it to a mo file, I cannot get translations for gettext("TEST") calls:

> ngettext("TEST", "TESTS", 1)
> TEST-SINGLE
> gettext("TEST")
> TEST

I am using the standard gettext package for Python. I am not sure if this merging behaviour is expected, but it seems to destroy the ability to look up translations for non-pluralized strings. Is there a way to avoid this?

I was thinking of hacking up a fallback for gettext that tries an ngettext call if the first lookup fails. That seems very hacky, though.

The problem appears to stem from the way that the gettext package looks up translations. For gettext and ugettext calls, it simply looks inside the catalog for _catalog['TEST'], and does not search for _catalog[("TEST", 0)].
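To make the lookup described above concrete, here is a rough sketch that builds a tiny mo file in memory and inspects the resulting catalog. The build_mo helper is hypothetical (written only for this illustration), and _catalog is a private attribute of CPython's gettext module, so treat this as a way to poke at the behaviour rather than a supported API.

```python
import gettext
import io
import struct


def build_mo(entries):
    """Pack {msgid: msgstr} byte strings into a minimal little-endian .mo file."""
    keys = sorted(entries)  # the format expects msgids in sorted order
    header_size = 7 * 4
    table_size = 8 * len(keys)
    data_start = header_size + 2 * table_size
    tables, data = [], b""
    # First the msgid table, then the msgstr table; each row is (length, offset).
    for strings in (keys, [entries[k] for k in keys]):
        for s in strings:
            tables.append((len(s), data_start + len(data)))
            data += s + b"\0"
    out = struct.pack("<7I", 0x950412DE, 0, len(keys),
                      header_size, header_size + table_size, 0, 0)
    for length, offset in tables:
        out += struct.pack("<2I", length, offset)
    return out + data


# Same entry as in the question: one msgid with a plural form attached.
mo_bytes = build_mo({
    b"": b"Content-Type: text/plain; charset=UTF-8\n"
         b"Plural-Forms: nplurals=2; plural=(n != 1);\n",
    b"TEST\0TESTS": b"TEST-SINGLE\0TEST-PLURAL",
})
translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))

# The merged entry is stored under tuple keys only; there is no plain "TEST" key.
keys = sorted(k for k in translations._catalog if k != "")
print(keys)
print(translations.ngettext("TEST", "TESTS", 1))
```

Whether translations.gettext("TEST") then returns the translation or the untranslated msgid depends on whether the gettext module in your Python version retries the lookup with the tuple key, so it is worth checking on the interpreter you actually deploy.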

I do not believe this is the correct behaviour, since xgettext itself merges the two strings, but I cannot find anything in the documentation that settles it one way or the other.

To solve this, I am monkey patching in two replacement methods for gettext and ugettext that fall back on a (message, 0) catalog lookup if the simple lookup fails.
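A minimal sketch of that fallback idea, for Python 3 (where ugettext no longer exists). It uses a subclass instead of a monkey patch, which is a swap on my part, and the class name FallbackTranslations is made up for the example. The catalog is filled in by hand to stand in for a compiled mo file, and it relies on the private _catalog attribute, so this is an assumption-laden illustration, not the actual patch.

```python
import gettext


class FallbackTranslations(gettext.GNUTranslations):
    """Hypothetical subclass: retry a failed singular lookup as plural form 0."""

    def gettext(self, message):
        missing = object()
        translated = self._catalog.get(message, missing)
        if translated is missing:
            # Plain key not found: try the key used for merged plural entries.
            translated = self._catalog.get((message, 0), missing)
        if translated is not missing:
            return translated
        if self._fallback:
            return self._fallback.gettext(message)
        return message


# Hand-built catalog standing in for the mo file from the question.
translations = FallbackTranslations()
translations._catalog = {("TEST", 0): "TEST-SINGLE", ("TEST", 1): "TEST-PLURAL"}
translations.plural = lambda n: int(n != 1)

print(translations.gettext("TEST"))
print(translations.ngettext("TEST", "TESTS", 2))
print(translations.gettext("OTHER"))
```

Index 0 is hard-coded here because that is what the answer describes; for languages whose plural rule does not map n == 1 to form 0, using self.plural(1) as the index would probably be the safer choice.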

The two entries are exactly the same, i.e. the singular entry has the same msgid as the plural one. To demonstrate:

msgid "Test"
msgstr "Toets"

msgid "Test"
msgid_plural "Tests"
msgstr[0] "Toets"
msgstr[1] "Toetse"

Then use msgfmt to compile it:

msgfmt bob.po
bob.po:4: duplicate message definition...
bob.po:2: ...this is the location of the first definition
msgfmt: found 1 fatal error

gettext uses the msgid as the key for each string, so it treats these singular and plural entries as duplicates.

I came across the same problem in a recent project. In my opinion, this could be considered a bug in the Python gettext module. Others share this opinion: there has been an open issue for it on the Python bug tracker since 2013. I proposed a patch, but it may take a while until it is included in a release.

Using ngettext("TEST", "TESTS", 1) instead of gettext("TEST") is a pretty simple workaround, of course. Not completely satisfactory, but it works.
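If you go with this workaround, a small wrapper keeps the call sites readable. The sketch below assumes a hand-filled catalog in place of a real mo file (again via the private _catalog attribute), and the helper name singular is invented for the example.

```python
import gettext

# Hand-built catalog standing in for a compiled mo file with a merged entry.
catalog = gettext.GNUTranslations()
catalog._catalog = {("TEST", 0): "TEST-SINGLE", ("TEST", 1): "TEST-PLURAL"}
catalog.plural = lambda n: int(n != 1)


def singular(translations, msgid, msgid_plural):
    """Hypothetical helper: fetch the singular form through ngettext with n=1."""
    return translations.ngettext(msgid, msgid_plural, 1)


print(singular(catalog, "TEST", "TESTS"))
```

The downside is that every caller has to know the plural msgid as well, which is part of why this is not completely satisfactory.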
