PHP regex parse data between [ ]
I have a string that I want to convert to valid JSON and then json_decode. Here's what the string looks like:
[
['test', 'lol'],
['test2', 'lol2']
]
[
['test32', 'loDl'],
['test32', 'loDl2']
]
[
['tes23t', 'loDEl'],
['testDE2', 'lolDE2']
]
I want to get only the data between each outer [ ] pair, so the result should be:
['test', 'lol'],
['test2', 'lol2'],
['test32', 'loDl'],
['test32', 'loDl2'],
['tes23t', 'loDEl'],
['testDE2', 'lolDE2']
So I think I need to use a regex with preg_split. Here's what I did:
$jsons = preg_split('/\]\s*(?=\[)/', $data, null);
$jsond = "";
foreach ($jsons as $json) {
$json .= "";
$jsond .= $json;
}
return $jsond;
But it's not working; I still can't get the data between each [ ].
How can I do that?
Thanks in advance.
PS: here's the real full string: https://paste.ee/r/RN7rK
The string you pointed to has valid JSON on each line. However, all the lines together do not represent one valid JSON document.
I propose to manipulate the data in a minimal way, with a simple regular expression, so that the whole text becomes JSON. If the original data is in $data, then create the JSON as follows:
$json = preg_replace('/(\])\](\R)\[/', '$1,$2', $data);
This removes both the closing bracket at the end of a line and the opening bracket at the start of the next line, and inserts a comma in their place. The result is valid JSON, because the opening bracket at the very start now matches the very last closing bracket.
I took some representative text from your data:
$data = '[["s","13","shelves_norja","49500","0","1","1","#ffffff,#F7EBBC","Beige Bookcase","For nic naks and books.","","5","true","-1","false","","1","true","0","0","0","false"],["s","117","table_plasto_round*9","45508","0","2","2","#ffffff,#533e10","Round Dining Table","Hip plastic furniture","","-1","false","-1","false","","1","false","0","0","0","false"]]
[["s","118","table_plasto_square*9","45508","0","1","1","#ffffff,#533e10","Occasional Table","Hip plastic furniture","","-1","false","-1","false","","1","false","0","0","0","false"],["s","119","chair_plasto*9","45508","0","1","1","#ffffff,#533e10,#ffffff,#533e10","Chair","Hip plastic furniture","","-1","false","-1","false","","1","false","0","1","0","false"],["s","120","carpet_standard*6","48082","0","3","5","#777777","Floor Rug","Available in a variety of colors","","105","true","-1","false","","1","true","1","0","0","false"],["s","121","chair_plasty*1","45508","0","1","1","#ffffff,#8EB5D1,#ffffff,#8EB5D1","Plastic Pod Chair","Hip plastic furniture","","-1","false","-1","false","","1","false","0","1","0","false"]]';
It has just two lines, to limit the data a bit. The above code produces this result, pretty printed:
[
[
"s",
"13",
"shelves_norja",
"49500",
"0",
"1",
"1",
"#ffffff,#F7EBBC",
"Beige Bookcase",
"For nic naks and books.",
"",
"5",
"true",
"-1",
"false",
"",
"1",
"true",
"0",
"0",
"0",
"false"
],
[
"s",
"117",
"table_plasto_round*9",
"45508",
"0",
"2",
"2",
"#ffffff,#533e10",
"Round Dining Table",
"Hip plastic furniture",
"",
"-1",
"false",
"-1",
"false",
"",
"1",
"false",
"0",
"0",
"0",
"false"
],
[
"s",
"118",
"table_plasto_square*9",
"45508",
"0",
"1",
"1",
"#ffffff,#533e10",
"Occasional Table",
"Hip plastic furniture",
"",
"-1",
"false",
"-1",
"false",
"",
"1",
"false",
"0",
"0",
"0",
"false"
],
[
"s",
"119",
"chair_plasto*9",
"45508",
"0",
"1",
"1",
"#ffffff,#533e10,#ffffff,#533e10",
"Chair",
"Hip plastic furniture",
"",
"-1",
"false",
"-1",
"false",
"",
"1",
"false",
"0",
"1",
"0",
"false"
],
[
"s",
"120",
"carpet_standard*6",
"48082",
"0",
"3",
"5",
"#777777",
"Floor Rug",
"Available in a variety of colors",
"",
"105",
"true",
"-1",
"false",
"",
"1",
"true",
"1",
"0",
"0",
"false"
],
[
"s",
"121",
"chair_plasty*1",
"45508",
"0",
"1",
"1",
"#ffffff,#8EB5D1,#ffffff,#8EB5D1",
"Plastic Pod Chair",
"Hip plastic furniture",
"",
"-1",
"false",
"-1",
"false",
"",
"1",
"false",
"0",
"1",
"0",
"false"
]
]
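The merge-then-decode step can be sketched end to end as follows. The function name mergeJsonLines and the shortened sample are only illustrative (the rows are cut down from the data above); the regular expression is the one from this answer, and JSON_PRETTY_PRINT is one way to get output shaped like the listing above.

```php
<?php
// Hypothetical helper name; the pattern is the one used in the answer.
function mergeJsonLines(string $data): string
{
    // Turn "]]<newline>[" into "],<newline>" so consecutive lines fuse into one array.
    return preg_replace('/(\])\](\R)\[/', '$1,$2', $data);
}

// Shortened two-line sample in the same shape as the real data.
$data = '[["s","13","shelves_norja"],["s","117","table_plasto_round*9"]]' . "\n"
      . '[["s","118","table_plasto_square*9"]]';

$rows = json_decode(mergeJsonLines($data), true);
if (!is_array($rows)) {
    // json_decode returns null when the text is not valid JSON.
    exit('Decoding failed: ' . json_last_error_msg() . PHP_EOL);
}

echo count($rows), ' rows', PHP_EOL;
echo json_encode($rows, JSON_PRETTY_PRINT), PHP_EOL;
```

Checking the json_decode result before using it matters here, because a single line that does not end in "]]" would leave the merged text invalid.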
If you have a lot of data and need to save memory, you can use something like this:
function genArrayFromFHandler($fh) {
    $state = 0; // 1: inside main brackets, 2: inside nested brackets, 3: inside quotes
    // fgetc() returns false at end of file
    while (($c = fgetc($fh)) !== false) {
        if ($state == 3 && $c !== '"') {
            // inside a quoted string every character, brackets included, is data
            $item .= $c;
            continue;
        }
        switch ($c) {
            case '[':
                if ($state) $array = [];
                $state++;
                break;
            case ']':
                if ($state == 2) yield $array;
                $state--;
                break;
            case '"':
                if ($state == 2) {
                    // opening quote: start a new item
                    $state++;
                    $item = '';
                } else {
                    // closing quote: store the finished item
                    $state--;
                    $array[] = $item;
                }
                break;
        }
    }
}
try {
    $fh = fopen('yourfile.txt', 'r');
    if (!$fh) throw new Exception('File open failed.');
    foreach (genArrayFromFHandler($fh) as $arr) {
        // do whatever you need with the array here
        print_r($arr);
    }
    fclose($fh);
} catch (Exception $e) {
    echo $e->getMessage() . PHP_EOL;
}
This isn't a fast method, but the memory footprint is very low since the file is never fully loaded into memory.
Note: I don't know what you want to do with your data, or whether converting it to JSON is a good idea. JSON is useful for sharing data between applications, but if you want something you can query easily and efficiently, a database is the better choice.
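If every line of the file really is a complete JSON array, as another answer here observes, a line-based variant may be a reasonable middle ground: it keeps only one line in memory at a time and lets json_decode do the parsing. This is only a sketch under that assumption; the function name genRowsFromLines is made up, and the in-memory stream stands in for a real file handle.

```php
<?php
// Assumption: each non-empty line is a complete JSON array of rows.
function genRowsFromLines($fh)
{
    while (($line = fgets($fh)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $rows = json_decode($line, true);
        if (!is_array($rows)) {
            throw new RuntimeException('Invalid JSON line: ' . json_last_error_msg());
        }
        foreach ($rows as $row) {
            yield $row; // hand back one inner array at a time
        }
    }
}

// Demo on an in-memory stream; replace it with fopen('yourfile.txt', 'r') for real data.
$fh = fopen('php://memory', 'w+');
fwrite($fh, "[[\"a\",\"1\"],[\"b\",\"2\"]]\n[[\"c\",\"3\"]]\n");
rewind($fh);
foreach (genRowsFromLines($fh) as $row) {
    echo implode(' | ', $row), PHP_EOL;
}
fclose($fh);
```

Unlike the character-by-character reader, this relies on json_decode to handle escaped quotes and brackets inside strings, at the cost of holding one full line in memory.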
This is also a possible solution:
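$data = preg_replace('/\s+/', '', $data); // note: this also strips whitespace inside the values
$data = str_replace("]][[", "],[", $data); // join adjacent groups with a comma
$data = str_replace("[[", "[", $data);
$jsons = str_replace("]]", "]", $data);
echo $jsons;
['test','lol'],['test2','lol2'],['test32','loDl'],['test32','loDl2'],['tes23t','loDEl'],['testDE2','lolDE2']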
<?php
$data = "[
['test', 'lol'],
['test2', 'lol2']
]
[
['test32', 'loDl'],
['test32', 'loDl2']
]
[
['tes23t', 'loDEl'],
['testDE2', 'lolDE2']
]";
$matches = array();
// Without the "s" modifier "." does not match newlines, so only the single-line inner arrays match.
preg_match_all('/\[(.*?)\]/', $data, $matches);
$json = implode(',', $matches[0]);
echo $json;
?>
will output:
['test', 'lol'],['test2', 'lol2'],['test32', 'loDl'],['test32', 'loDl2'],['tes23t', 'loDEl'],['testDE2', 'lolDE2']
Is this what you want?
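One caveat if the goal is still to json_decode the result: the joined string has no outer brackets and uses single quotes, so it is not JSON yet. A minimal sketch of the remaining steps, assuming the values themselves never contain quote characters or brackets (the function name singleQuotedGroupsToRows is made up for illustration):

```php
<?php
// Hypothetical helper: collect the inner [...] groups and decode them as one JSON array.
function singleQuotedGroupsToRows(string $data): array
{
    // "." does not match newlines here, so only the single-line inner arrays are captured.
    preg_match_all('/\[(.*?)\]/', $data, $matches);

    // Wrap the groups in one outer array and switch to the double quotes JSON requires.
    $json = '[' . implode(',', $matches[0]) . ']';
    $json = str_replace("'", '"', $json);

    $rows = json_decode($json, true);
    if (!is_array($rows)) {
        throw new RuntimeException('Decoding failed: ' . json_last_error_msg());
    }
    return $rows;
}

$data = "[\n['test', 'lol'],\n['test2', 'lol2']\n]\n[\n['test32', 'loDl']\n]";
print_r(singleQuotedGroupsToRows($data));
```

The blanket quote replacement is the fragile part: a value such as 'it's' would break the result, so this only suits data shaped like the sample in the question.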