PHP parsing text to a structured JSON
I have a text like this:
some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete
I need a structured JSON like this:
{
"data":
{
"Team": "Xª",
"ID": "1234567-89.0123.45.6789",
"Type": "(YZ)",
"Date 1": "01/01/2011",
"Name": "Esbjörn Svensson",
"Date 2: "02/02/2022",
"Obs": "Awesome Trio",
"Date 3": ""
},
{
"Team": "Wª",
"ID": "0987654-32.1098.76.5432",
"Type": "(KBoo)",
"Date 1": "07/09/2013",
"Name": "Some Full Name",
"Date 2: "09/07/2017",
"Obs": "Observation",
"Date 3": "12/12/2018"
},
{
"Team": "Xª",
"ID": "4335678-98.7123.95.5689",
"Type": "",
"Date 1": "09/10/2010",
"Name": "Name Here",
"Date 2: "08/09/2020",
"Obs": "Observation",
"Date 3": ""
}
}
I searched through a lot of code here, but I can't get it to work the way I need. I tried to split the text wherever there is a blank space and the "ª" character, but it didn't work.
foreach($textsource as &$lista) {
$y = implode(' ',$lista);
$x = preg_split(' ', $y);
$delimiter = '/\ª/';
$childIndex = array_keys(preg_grep($delimiter, $x));
$chunks = [];
$final = [];
for ($i=0; $i<count($childIndex); $i++) {
$chunks[$i]['begin'] = $childIndex[$i];
if (isset($childIndex[$i+1])) {
$chunks[$i]['len'] = $childIndex[$i+1]-$childIndex[$i];
}
}
foreach ($chunks as $chunk) {
if (isset($chunk['len'])){
$final[] = array_slice($x, $chunk['begin'], $chunk['len']);
} else {
$final[] = array_slice($x, $chunk['begin']);
}
}
echo "<pre>";
print_r($final);
echo "</pre>";
I appreciate any help.
So I tried to solve this; here is a working solution. By the way, your JSON is not valid: the objects under "data" have to be wrapped in an array, and the "Date 2" keys are missing their closing quote. Check it with jsonlint.
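If you would rather run that validity check locally, here is a minimal sketch using PHP's built-in `json_decode` and `json_last_error`. The helper name `isValidJson` and the two shortened sample strings are made up for this illustration; they only mirror the shape of the JSON in the question.

```php
<?php
// isValidJson() is a hypothetical helper name used only for this illustration.
function isValidJson(string $json): bool
{
    json_decode($json);
    return json_last_error() === JSON_ERROR_NONE;
}

// Objects listed one after another under a single key, as in the question: invalid.
$broken = '{"data": {"Team": "Xª"}, {"Team": "Wª"}}';
// The same objects wrapped in an array: valid.
$fixed  = '{"data": [{"Team": "Xª"}, {"Team": "Wª"}]}';

var_dump(isValidJson($broken)); // bool(false)
var_dump(isValidJson($fixed));  // bool(true)
```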
$text = "some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete";
// Split on the "ª" marker. The team letter is the last character before each marker.
$arr = explode("ª", $text);
$team_arr = array_map(function ($team){ return substr($team, -1)."ª"; }, $arr);
array_shift($arr);     // drop the text before the first marker
array_pop($team_arr);  // the last chunk is not followed by a marker
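$res = [];
$end = count($arr);
// Start at 0: after array_shift() the chunks and the team names share the same indexes.
for ($i = 0; $i < $end; $i++) {
    $obj = $arr[$i];
    // Every chunk except the last ends with the team letter of the next record; drop it.
    if ($i < $end - 1) {
        $obj = substr($obj, 0, -1);
    }
    $temp_obj_arr = explode(' ', trim($obj));

    // Type is the optional "(...)" group, parentheses included.
    preg_match('#\((.*?)\)#', $obj, $match);
    $type = (!empty($match[0]) ? $match[0] : "");

    // Collect up to three dd/mm/yyyy dates in order of appearance.
    preg_match_all('/(\d{2})\/(\d{2})\/(\d{4})/', $obj, $dates);
    $date1 = (!empty($dates[0][0]) ? $dates[0][0] : "");
    $date2 = (!empty($dates[0][1]) ? $dates[0][1] : "");
    $date3 = (!empty($dates[0][2]) ? $dates[0][2] : "");

    // Name: the text after the first date, up to the next digit.
    $tname = explode($date1." ", $obj);
    $char_arr = str_split($tname[1]);
    $name = '';
    foreach ($char_arr as $ch) {
        if (is_numeric($ch)) {
            break;
        }
        $name .= $ch;
    }
    $name = trim($name);

    // Obs: the text after the second date, up to the next digit.
    // Free text that follows an observation without a third date is kept,
    // because nothing in the input marks where the observation ends.
    $tname = explode($date2." ", $obj);
    $char_arr = str_split($tname[1]);
    $obs = '';
    foreach ($char_arr as $ch) {
        if (is_numeric($ch)) {
            break;
        }
        $obs .= $ch;
    }
    $obs = trim($obs);

    $res[] = [
        'Team'   => $team_arr[$i],
        'ID'     => $temp_obj_arr[0],
        'Type'   => $type,
        'Date 1' => $date1,
        'Name'   => $name,
        'Date 2' => $date2,
        'Obs'    => $obs,
        'Date 3' => $date3,
    ];
}
// The second argument of json_encode() is a bitmask of flags, not a boolean.
$json_res = json_encode($res, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
echo $json_res;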
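As an alternative to splitting and scanning character by character, the same fields could be captured with a single regular expression. The following is only a sketch under a few assumptions that the question does not confirm: a team is one letter followed by "ª", the ID contains no spaces, every record has a first and a second date, and a record ends at its third date, at the next team marker, or at the end of the text. The function name `parseRecords` is invented for this example.

```php
<?php
// parseRecords() is a hypothetical helper, not part of the original answer.
function parseRecords(string $text): array
{
    $date = '\d{2}/\d{2}/\d{4}';
    $pattern = '~
        (?<team>\p{L}ª)\s+              # team letter followed by the ª marker
        (?<id>\S+)\s+                   # ID token (assumed to contain no spaces)
        (?:(?<type>\([^)]*\))\s+)?      # optional type in parentheses
        (?<date1>' . $date . ')\s+
        (?<name>\D+?)\s+                # name: everything up to the next date
        (?<date2>' . $date . ')\s+
        (?<obs>\D+?)                    # observation text
        (?:\s+(?<date3>' . $date . ')   # ends at an optional third date,
          |(?=\s+\p{L}ª\s)              # or just before the next team marker,
          |\s*$)                        # or at the end of the text
    ~xu';

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $records = [];
    foreach ($matches as $m) {
        $records[] = [
            'Team'   => $m['team'],
            'ID'     => $m['id'],
            'Type'   => $m['type'] ?? '',   // empty when the group did not match
            'Date 1' => $m['date1'],
            'Name'   => $m['name'],
            'Date 2' => $m['date2'],
            'Obs'    => $m['obs'],
            'Date 3' => $m['date3'] ?? '',  // absent when the group did not match
        ];
    }
    return $records;
}

$text = "some text Xª 1234567-89.0123.45.6789 (YZ) 01/01/2011 Esbjörn Svensson 02/02/2022 Awesome Trio Wª 0987654-32.1098.76.5432 (KBoo) 07/09/2013 Some Full Name 09/07/2017 Observation 12/12/2018 some text that I don't want to keep Xª 4335678-98.7123.95.5689 09/10/2010 Name Here 08/09/2020 Observation and more text to delete";

echo json_encode(['data' => parseRecords($text)], JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT);
```

With this input the sketch returns three records. Text that follows a third date is skipped, but the last record's "Obs" still includes the trailing "and more text to delete", since the input gives no marker for where an observation without a third date ends.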