
What is the best regex pattern to catch all text in a .php page that is wrapped in a gettext function (examples below)?

Here is some code that could appear in a PHP page, with the variations (from most to least important) that should be accounted for:

 <p><?= $translator->gettext('string example') ?></p>
 <p><?= sprintf($translator->gettext("string example n2"), 'Dave') ?></p>
 <p><?= $translator->ngettext('string example', 'string examples', 2) ?></p>
 <p><?= sprintf($translator->ngettext("string example n2", "string examples n2", 2), 2) ?></p>
 <p><?= $translator->pgettext('Context1', 'string example') ?></p>
 <p><?= $translator->npgettext('Context2', 'string example', 'string examples', 2) ?></p>

Ideally the expected output would look something like this:

array(2) {
  [0]=>
  array(6) {
    [0]=> string(45) "<?= $translator->gettext('string example') ?>"
    [1]=> string(48) "<?= $translator->gettext('string example n2') ?>"
    [2]=> string(68) "<?= $translator->ngettext('string example', 'string examples', 2) ?>"
    [3]=> string(74) "<?= $translator->ngettext('string example n2', 'string examples n2', 2) ?>"
    [4]=> string(58) "<?= $translator->pgettext('Context1', 'string example') ?>"
    [5]=> string(81) "<?= $translator->npgettext('Context2', 'string example', 'string examples', 2) ?>"
  }
  [1]=>
  array(6) {
    [0]=> array(1) { [0]=> string(14) "string example" }
    [1]=> array(1) { [0]=> string(17) "string example n2" }
    [2]=> array(2) {
          [0]=> string(14) "string example"
          [1]=> string(15) "string examples"
        }
    [3]=> array(2) {
          [0]=> string(17) "string example n2"
          [1]=> string(18) "string examples n2"
        }
    [4]=> array(2) {
          [0]=> string(8) "Context1"
          [1]=> string(14) "string example"
        }
    [5]=> array(3) {
          [0]=> string(8) "Context2"
          [1]=> string(14) "string example"
          [2]=> string(15) "string examples"
        }
  }
}

Basically, the first array would keep the full call, so that it is clear which gettext function wraps the text. The second array would hold just the wrapped strings, so that they can be used for auto-translation.

For now, this is what I have tried:

preg_match_all('/<\?= \$translator->gettext\(\'([^\']+)\'\) \?>/', $contents, $matches, PREG_PATTERN_ORDER);

But this pattern only matches the first instance, <p><?= $translator->gettext('string example') ?></p>, and misses all the others.

Edit: Also, if Regex cannot solve this problem, then what can?

Regex can do the job just fine. Here are patterns for the first 2 cases:

gettext:

preg_match_all('/\$translator\s*->\s*gettext\s*\(\s*(\"([^\"]+)\"|\'([^\']+)\')\s*\)/', $contents, $gettextMatches, PREG_PATTERN_ORDER|PREG_OFFSET_CAPTURE);

ngettext:

preg_match_all('/\$translator\s*->\s*ngettext\s*\(\s*(\"([^\"]+)\"|\'([^\']+)\')\s*,\s*(\"([^\"]+)\"|\'([^\']+)\')\s*,\s*([^\)]+?)\s*\)/', $contents, $ngettextMatches, PREG_PATTERN_ORDER|PREG_OFFSET_CAPTURE);
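Because both calls pass PREG_OFFSET_CAPTURE, every entry in the match arrays is a [text, offset] pair rather than a plain string, and the capture group for the quote style that did not match holds an empty string. A minimal sketch of flattening the gettext results, assuming a pattern of the same shape as the gettext one above (the sample input and the $rows layout are made up for illustration):

```php
<?php
// Sample input, made up for illustration.
$contents = <<<'HTML'
<p><?= $translator->gettext('string example') ?></p>
<p><?= sprintf($translator->gettext("string example n2"), 'Dave') ?></p>
HTML;

// Group 2 holds double-quoted text, group 3 holds single-quoted text.
$pattern = '/\$translator\s*->\s*gettext\s*\(\s*("([^"]+)"|\'([^\']+)\')\s*\)/';
preg_match_all($pattern, $contents, $m, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);

$rows = [];
foreach ($m[0] as $i => [$call, $offset]) {
    // Exactly one of groups 2 and 3 is non-empty for each match.
    $text = $m[2][$i][0] !== '' ? $m[2][$i][0] : $m[3][$i][0];
    $rows[] = ['call' => $call, 'offset' => $offset, 'text' => $text];
}

var_export($rows);
```

The offset is the byte position of the call in $contents, which can be handy if the translated text has to be written back to the same place.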

The output will need to be formatted depending on your needs. I can construct the patterns for the pgettext and npgettext cases if somebody needs them and doesn't know how; just comment or message me and I will help.
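For the remaining cases, one possible approach is a two-step match instead of a separate pattern per function: first find each call, then pull the quoted strings out of its argument list. The sketch below is only an illustration. The function name extractGettextCalls is hypothetical, and it assumes the arguments contain no parentheses and the strings contain no escaped quotes.

```php
<?php
// Hypothetical helper, not part of the patterns above.
function extractGettextCalls(string $contents): array
{
    // n?p?gettext covers gettext, ngettext, pgettext and npgettext.
    // Assumption: the argument list itself contains no parentheses.
    $callPattern = '/\$translator\s*->\s*(n?p?gettext)\s*\(([^()]*)\)/';

    // Branch reset (?|...) puts the text of either quote style in group 1.
    // Assumption: the string literals contain no escaped quotes.
    $literalPattern = '/(?|"([^"]*)"|\'([^\']*)\')/';

    $calls = [];
    $strings = [];
    preg_match_all($callPattern, $contents, $found, PREG_SET_ORDER);
    foreach ($found as $call) {
        // $call[0] is the whole call, $call[2] is the raw argument list.
        preg_match_all($literalPattern, $call[2], $literals);
        $calls[] = $call[0];
        $strings[] = $literals[1];
    }
    return [$calls, $strings];
}

// Sample input taken from the examples in the question.
$contents = <<<'HTML'
<p><?= $translator->gettext('string example') ?></p>
<p><?= sprintf($translator->ngettext("string example n2", "string examples n2", 2), 2) ?></p>
<p><?= $translator->npgettext('Context2', 'string example', 'string examples', 2) ?></p>
HTML;

[$calls, $strings] = extractGettextCalls($contents);
var_export($strings);
```

The first returned array holds each call as it appears in the source (without the surrounding <?= ?> tags or sprintf wrapper), and the second holds the strings per call, in argument order, so a context string from pgettext or npgettext comes first.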
