
How to extract text between predefined strings in PHP

We're changing our translation system from GetText to a MySQL database. I want to put all the translation strings and translation IDs from the original ".po" file into the database.

For this I need to read the file and loop through each line, which is easy. The difficult part is that when I see "msgid" or "msgstr", I need to extract the data and insert it into the database.

The original file looks like this:

msgid "inactive_ad_detail_text"
msgstr "This ad doesn't exists"
msgid "breadcrumb_search"
msgstr "Search the site"
(... etc etc ...)

How can I extract the name of the ID (msgid) and the text (msgstr) between the quotation marks?

Also, I have some escaped text and some two-line text, like:

msgid "question_fill_form"
msgstr ""
"Please fill the form"
"<br>All fields are mandatory"

or

msgid "offer_contact_error"
msgstr ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."

I think I need to detect [msgid "] and then the last ["] quotation mark before the end of the line, but I really have no clue how to achieve this in PHP.

Thanks for your help, Lio

There is a library for this: PHP-po-parser.

// Parse a po file
$fileHandler = new Sepia\FileHandler('es.po');

$poParser = new Sepia\PoParser($fileHandler);
$entries  = $poParser->parse();
// $entries contains every entry in es.po file.

// Update entries
$msgid = 'Press this button to save';
$entries[$msgid]['msgstr'] = 'Pulsa este botón para guardar';
$poParser->setEntry($msgid, $entries[$msgid]);
// You can also change translator comments, code comments, flags...

If you don't use Composer, you can include the files in order, or use an autoloader to load them.

require_once('Sepia/InterfaceHandler.php');
require_once('Sepia/StringHandler.php');
require_once('Sepia/FileHandler.php');
require_once('Sepia/PoParser.php');

A solution using the file, strpos and substr functions:

Let's say the input file msgdata has the following contents:

msgid "question_fill_form"
msgstr ""
"Please fill the form"
"<br>All fields are mandatory"
msgid "offer_contact_error"
msgstr ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."

Processing the lines in order:

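// Read the file into an array of lines (each line keeps its trailing newline)
$lines = file('msgdata');
$result = [];

foreach ($lines as $line) {
    if (strpos($line, 'msgid') === 0) {
        // New entry: keep everything after the keyword (leading space and quotes included)
        $result[] = ['msgid' => substr($line, strpos($line, ' '))];
    } elseif (strpos($line, 'msgstr') === 0) {
        // Translation for the most recent msgid
        $result[count($result)-1]['msgstr'] = substr($line, strpos($line, ' '));
    } else {
        // Continuation line: append it to the current msgstr
        $result[count($result)-1]['msgstr'] .= $line;
    }
}

print_r($result);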

The output:

Array
(
    [0] => Array
        (
            [msgid] =>  "question_fill_form"

            [msgstr] =>  ""
"Please fill the form"
"<br>All fields are mandatory"

        )

    [1] => Array
        (
            [msgid] =>  "offer_contact_error"

            [msgstr] =>  ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."
        )
)
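
The values in this output still carry the surrounding quotation marks, the backslash escapes and the line breaks of the raw file. Below is a minimal sketch of one way to get clean strings, assuming the file contains only plain msgid/msgstr pairs (no plural forms, no msgctxt). The function name parsePoLines is hypothetical and not part of any library. It joins continuation lines for whichever key came last and unescapes the text with stripcslashes.

```php
<?php

/**
 * Hypothetical helper: turn the lines of a simple .po file into
 * a list of ['msgid' => ..., 'msgstr' => ...] arrays with clean strings.
 * Assumption: only plain msgid/msgstr entries; anything else is ignored.
 */
function parsePoLines(array $lines): array
{
    $entries = [];
    $current = null; // entry being built
    $field   = null; // 'msgid' or 'msgstr': where continuation lines go

    foreach ($lines as $line) {
        $line = trim($line);

        if (preg_match('/^(msgid|msgstr)\s+"(.*)"$/', $line, $m)) {
            if ($m[1] === 'msgid') {
                // A new msgid closes the previous entry
                if ($current !== null) {
                    $entries[] = $current;
                }
                $current = ['msgid' => '', 'msgstr' => ''];
            }
            if ($current === null) {
                continue; // msgstr with no preceding msgid: skip it
            }
            $field = $m[1];
            // The greedy (.*) runs to the last quote on the line,
            // so escaped quotes inside the text are kept, then unescaped
            $current[$field] .= stripcslashes($m[2]);
        } elseif (preg_match('/^"(.*)"$/', $line, $m)) {
            // Continuation line: append to the key seen most recently
            if ($current !== null && $field !== null) {
                $current[$field] .= stripcslashes($m[1]);
            }
        } else {
            // Blank line, comment or unsupported keyword
            $field = null;
        }
    }

    if ($current !== null) {
        $entries[] = $current;
    }

    return $entries;
}

// Usage with sample data taken from the question
$po = <<<'PO'
msgid "breadcrumb_search"
msgstr "Search the site"
msgid "offer_contact_error"
msgstr ""
"Error detected "
"please click \"<em>restart</em>\" on the right side."
PO;

$entries = parsePoLines(explode("\n", $po));
print_r($entries);
```

For a real file, the lines could come from file('msgdata', FILE_IGNORE_NEW_LINES). A typical .po header (an entry with an empty msgid) would come back as an ordinary entry whose msgid is an empty string, so it may need to be skipped before inserting into the database.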
