
xgettext - How to extract strings split by nulls

Extracting strings with xgettext from gst-plugins-base fails for a string whose entries are separated by null bytes:

static const gchar genres[] =
"Blues\000Classic Rock\000Country\000Dance\000Disco\000Funk\000Grunge\000"
"Hip-Hop\000Jazz\000Metal\000New Age\000Oldies\000Other\000Pop\000R&B\000"
"Rap\000Reggae\000Rock\000Techno\000Industrial\000Alternative\000Ska\000"
"Death Metal\000Pranks\000Soundtrack\000Euro-Techno\000Ambient\000Trip-Hop"
"\000Vocal\000Jazz+Funk\000Fusion\000Trance\000Classical\000Instrumental\000"
"Acid\000House\000Game\000Sound Clip\000Gospel\000Noise\000Alternative Rock"
"\000Bass\000Soul\000Punk\000Space\000Meditative\000Instrumental Pop\000"
"Instrumental Rock\000Ethnic\000Gothic\000Darkwave\000Techno-Industrial\000"
"Electronic\000Pop-Folk\000Eurodance\000Dream\000Southern Rock\000Comedy"
"\000Cult\000Gangsta\000Top 40\000Christian Rap\000Pop/Funk\000Jungle\000"
"Native American\000Cabaret\000New Wave\000Psychedelic\000Rave\000Showtunes"
"\000Trailer\000Lo-Fi\000Tribal\000Acid Punk\000Acid Jazz\000Polka\000"
"Retro\000Musical\000Rock & Roll\000Hard Rock\000Folk\000Folk/Rock\000"
"National Folk\000Swing\000Bebob\000Latin\000Revival\000Celtic\000Bluegrass"
"\000Avantgarde\000Gothic Rock\000Progressive Rock\000Psychedelic Rock\000"
"Symphonic Rock\000Slow Rock\000Big Band\000Chorus\000Easy Listening\000"
"Acoustic\000Humour\000Speech\000Chanson\000Opera\000Chamber Music\000"
"Sonata\000Symphony\000Booty Bass\000Primus\000Porn Groove\000Satire\000"
"Slow Jam\000Club\000Tango\000Samba\000Folklore\000Ballad\000Power Ballad\000"
"Rhythmic Soul\000Freestyle\000Duet\000Punk Rock\000Drum Solo\000A Capella"
"\000Euro-House\000Dance Hall\000Goa\000Drum & Bass\000Club-House\000"
"Hardcore\000Terror\000Indie\000BritPop\000Negerpunk\000Polsk Punk\000"
"Beat\000Christian Gangsta Rap\000Heavy Metal\000Black Metal\000"
"Crossover\000Contemporary Christian\000Christian Rock\000Merengue\000"
"Salsa\000Thrash Metal\000Anime\000Jpop\000Synthpop";

I'm using xgettext 0.21 to extract the strings:

xgettext -a --no-wrap ./gst-libs/gst/tag/gstid3tag.c -o -

I get only one of the strings:

#: gst-libs/gst/tag/gstid3tag.c:51
msgid "Blues"
msgstr ""

I should also get "Classic Rock", "Country", "Dance", and so on.

Is there any other way to extract those strings? Maybe some other tool or by using specific flags with the xgettext command?

There is no way to extract this string with xgettext, and that is by design. Even if there were a way, no tools are available for editing a PO file whose entries contain null bytes.
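One way to see why only the first genre shows up: to anything that treats the array as an ordinary C string, the value ends at the first null byte. The sketch below uses a shortened copy of the list and a hypothetical helper, count_entries (not part of gst-plugins-base), to contrast what strlen sees with what the array actually holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened copy of the list from the question, for illustration only. */
static const char genres[] = "Blues\000Classic Rock\000Country\000Dance";

/* Hypothetical helper: counts the NUL-terminated entries packed into a
 * buffer of `size` bytes, where `size` includes the final terminator. */
static size_t
count_entries (const char *buf, size_t size)
{
  size_t count = 0;
  size_t off = 0;

  /* The last byte must be a NUL, otherwise strlen could run off the end. */
  assert (buf != NULL && size > 0 && buf[size - 1] == '\0');

  while (off < size) {
    off += strlen (buf + off) + 1;   /* skip the entry and its NUL */
    count++;
  }
  return count;
}
```

With this array, strlen (genres) is 5, the length of "Blues" alone, while count_entries (genres, sizeof genres) is 4.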

The solution is to assemble the string with the null bytes at runtime or compile time. The latter would require a helper script that generates the source file containing the genre list.
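The runtime variant could look roughly like the following minimal sketch. It assumes a no-op marker macro N_ that xgettext is told about with --keyword=N_, and the names genre_names and build_genres are made up for this illustration. Each genre is a separate literal that xgettext can extract, and the null-delimited buffer is built once at startup. Looking up the translations at display time is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* No-op marker (assumption): lets xgettext --keyword=N_ find the strings
 * without translating them at initialization time. */
#define N_(s) (s)

/* One literal per genre, so each becomes its own msgid. */
static const char *const genre_names[] = {
  N_("Blues"),
  N_("Classic Rock"),
  N_("Country"),
  N_("Dance"),
};

#define N_GENRES (sizeof genre_names / sizeof genre_names[0])

/* Hypothetical helper: returns a heap buffer holding every genre followed
 * by a NUL byte and stores the total size in *len. The caller frees it. */
static char *
build_genres (size_t *len)
{
  size_t total = 0, off = 0, i;
  char *buf;

  for (i = 0; i < N_GENRES; i++)
    total += strlen (genre_names[i]) + 1;

  buf = malloc (total);
  if (buf == NULL)
    return NULL;

  for (i = 0; i < N_GENRES; i++) {
    size_t n = strlen (genre_names[i]) + 1;   /* include the NUL */
    memcpy (buf + off, genre_names[i], n);
    off += n;
  }
  assert (off == total);

  *len = total;
  return buf;
}
```

The resulting buffer has the same bytes as the original literal, including the terminating NUL, so code that walks the list by offset should not need to change.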

An example in Perl:

#! /usr/bin/env perl

use strict;

# Stub gettext that just returns the argument.
sub gettext {
    shift;
}

my $genres = join '\\000', (
    gettext('Blues'),
    gettext('Classic Rock'),
    gettext('Country'),
    gettext('Dance'),
);

print <<EOF;
static const gchar genres[] = "$genres";
EOF

Running the script prints the required C snippet. Passing the script itself (genres.pl) to xgettext as an additional source file adds all genres to your PO file:

$ xgettext --omit-header -o - genres.pl
#: genres.pl:11
msgid "Blues"
msgstr ""

#: genres.pl:12
msgid "Classic Rock"
msgstr ""

#: genres.pl:13
msgid "Country"
msgstr ""

#: genres.pl:14
msgid "Dance"
msgstr ""

You can, of course, do this in any other language that xgettext supports, not just Perl. Pick the one that is easiest to integrate into your build system.

Simply switching to a different delimiter (for example "Blues:Classic Rock:...") not only raises escaping issues but also results in a PO file that is awkward to translate.
