Parse Tokens with Regex in PHP
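I am trying to parse a token file that looks like the one below, in order to get the token name/value pairs. The token/value/nesting relationships are already defined, so I cannot change how the token file is produced. A context-free grammar seems like it might be the best approach, but I have no experience writing or implementing grammars. Can this be done with regular expressions? I have had no luck with the nested multi-line tokens (e.g. Master1, Servant2).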
;token1 = I am a top level single line token
;token2 {
I am a top level
multiline line token
}
master1 {
;servant1 = I am Master1, Servant1 single line token
;servant2 {
I am Master1, Servant2.
A mulit line token.
}
;servant3 = I am Master1, Servant3
}
master2 {
;servant1 = I am Master2, Servant1
;servant2 {
I am Master2, Servant2
A mulit line token.
}
;servant3 = I am Master2, Servant3
}
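PHP has a built-in function for tokenizing strings: strtok.
strtok splits a string (str) into smaller strings (tokens), with each token delimited by any character in the token argument. For example, given a string like "This is an example string", you can tokenize it into individual words by using the space character as the delimiter.

Here is a fairly simple line-walking parser. I originally tried to write a regex for it, but the missing leading ; at the start of a multi-line master made that considerably harder (with the ; it would have been fairly easy to write), so I gave up and wrote this: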
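function getTokens($string) {
    $string = trim($string);
    $lines = explode("\n", $string);
    $data = array();   // parsed name => value (a string or a nested array)
    $key = '';         // name of the block currently being collected
    $open = 0;         // current brace nesting depth
    $buffer = '';      // raw lines collected inside the open block
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            // Skip blank lines; the strict comparison keeps a line that is just "0".
            continue;
        } elseif (strpos($line, '}') === 0) {
            $open--;
            if ($open == 0) {
                // The outermost block just closed: parse its body recursively.
                $data[$key] = getTokens($buffer);
                $buffer = '';
            } elseif ($open < 0) {
                throw new Exception('Unmatched }');
            } else {
                // A nested block closed: keep the brace for the recursive call.
                $buffer .= "\n" . $line;
            }
        } elseif ($open > 0) {
            // Inside a block: collect the line verbatim and track nested openings.
            if (strpos($line, '{') !== false) {
                $open++;
            }
            $buffer .= "\n" . $line;
        } elseif ($line[0] == ';') {
            if (strpos($line, "=") !== false) {
                // Single-line token: ;name = value
                list ($key, $value) = explode("=", $line, 2);
                $key = trim(substr($key, 1));
                $value = trim($value);
                $data[$key] = $value;
            } elseif (strpos($line, "{") !== false) {
                // Multi-line token: ;name {
                $open++;
                list ($key, $value) = explode("{", $line, 2);
                $key = trim(substr($key, 1));
            } else {
                throw new Exception('Unmatched token ;');
            }
        } elseif (strpos($line, '{') !== false) {
            // Master block without a leading ';': name {
            $open++;
            list ($key, $value) = explode("{", $line, 2);
            $key = trim($key);
        } else {
            // Plain text line, i.e. the body of a multi-line token.
            $buffer .= "\n" . $line;
        }
    }
    if ($open > 0) {
        throw new Exception('Unmatched {');
    } elseif (empty($data) && !empty($buffer)) {
        // No tokens were found, so the input is a plain multi-line value.
        return trim($buffer);
    }
    return $data;
}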
When I feed in your string as input, I get:
Array(
    "token1" => "I am a top level single line token",
    "token2" => "I am a top level
multiline line token",
    "master1" => Array(
        "servant1" => "I am Master1, Servant1 single line token",
        "servant2" => "I am Master1, Servant2.
A mulit line token.",
        "servant3" => "I am Master1, Servant3",
    ),
    "master2" => Array(
        "servant1" => "I am Master2, Servant1",
        "servant2" => "I am Master2, Servant2
A mulit line token.",
        "servant3" => "I am Master2, Servant3",
    ),
)
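For comparison, here is a minimal sketch of how strtok behaves, both on the "This is an example string" case and on a line-by-line split. The sample input and the variable names ($words, $lines, $input) are illustrative assumptions, not part of the original answers. Note that strtok only produces a flat sequence of pieces; it does not track brace nesting, so something like the depth counter in the line-walking parser above would still be needed.

```php
<?php
// Split a sentence into words, using the space character as the delimiter.
$words = array();
$word = strtok("This is an example string", " ");
while ($word !== false) {
    $words[] = $word;
    // Later calls pass only the delimiter and continue from the last position.
    $word = strtok(" ");
}

// Hypothetical sample input, shortened from the token file in the question.
$input = ";token1 = first value\nmaster1 {\n;servant1 = nested value\n}";

// Split the same way on newlines to walk the file line by line.
$lines = array();
$line = strtok($input, "\n");
while ($line !== false) {
    $lines[] = trim($line);
    $line = strtok("\n");
}

print_r($words);
print_r($lines);
```

The second loop yields the four lines of the sample in order, with "master1 {" and "}" as ordinary entries, which is why the nesting has to be reconstructed separately.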