Split paragraphs in an array in PHP

I have an array of this form:

 $steps = array (0=> "the sentence one. the sentence two. the sentence three.",
            1=> "the sentence for. the sentence 5");

and I want to have an array $steps like this:

 $steps = array (0 => "the sentence one.",
            1 => "the sentence two.",
             .
             .
             4 =>"the sentence for."
             );

I tried to use explode and implode, but I did not succeed.

You can split the strings in your existing array using the regex (?<=\.\s)(?=\w), then iterate over the resulting pieces with a foreach loop and append each split string to a new array. Check this PHP code:

$steps = array (0=> "the sentence one. the sentence two. the sentence three.",
        1=> "the sentence for. the sentence 5");
$arr = array();
foreach ($steps as $s) {
    $mat = preg_split('/(?<=\.\s)(?=\w)/', $s);
    foreach($mat as $m) {
        array_push($arr,$m);
    }
}
print_r($arr);

Prints:

Array
(
    [0] => the sentence one. 
    [1] => the sentence two. 
    [2] => the sentence three.
    [3] => the sentence for. 
    [4] => the sentence 5
)

Judging by your current sample data, this assumes that a new sentence starts after a dot that is followed by a space. If you have more complicated sample data containing dots in various forms, please post those samples and, if need be, my solution can be updated to accommodate them as well.
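Note that the split above happens after the space, so every sentence except the last one in each string keeps a trailing space (visible in the printed output). If that matters, a minimal sketch of a variant is shown below. It works under the same assumption (a sentence ends with a dot followed by whitespace); the helper name split_sentences is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): split every string in
// $steps into sentences, assuming a sentence ends with a dot followed by whitespace.
function split_sentences(array $steps): array
{
    $sentences = [];
    foreach ($steps as $step) {
        // Split on the whitespace that follows a dot. The lookbehind keeps the
        // dot with its sentence; the whitespace itself is consumed and dropped.
        $parts = preg_split('/(?<=\.)\s+/', $step, -1, PREG_SPLIT_NO_EMPTY);
        foreach ($parts as $part) {
            $sentences[] = $part;
        }
    }
    return $sentences;
}

$steps = [
    "the sentence one. the sentence two. the sentence three.",
    "the sentence for. the sentence 5",
];
$result = split_sentences($steps);
print_r($result);
```

Here the whitespace is part of the delimiter rather than part of the preceding sentence, and PREG_SPLIT_NO_EMPTY guards against an empty last element when a string ends with a dot and a space.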

Let me know if this works for you: preg_split("/\. (?=[A-Z])/", join(" ", $steps));

Given your array (with each sentence capitalized):

$steps = array (
    0 => "The sentence one. The sentence two. The sentence three.",
    1 => "The sentence for. The sentence 5"
);

$steps_unified = preg_split("/\. (?=[A-Z])/", join(" ", $steps));

print_r ($steps_unified);

You will get:

Array ( 
    [0] => The sentence one 
    [1] => The sentence two 
    [2] => The sentence three 
    [3] => The sentence for 
    [4] => The sentence 5 
)

If we use proper grammar, each sentence should end with a '.' and the next one should begin with a space followed by a capitalized word.
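The pattern above consumes the dot, so the resulting sentences lose their final '.', while the array requested in the question keeps it. A possible adjustment, sketched here under the same capitalization assumption, moves the dot into a lookbehind so that only the space is removed. The variable name $sentences is just an illustration.

```php
<?php
$steps = [
    "The sentence one. The sentence two. The sentence three.",
    "The sentence for. The sentence 5",
];

// Join all strings, then split on a single space that is preceded by a dot
// and followed by a capital letter. Both conditions are lookarounds, so only
// the space is consumed and each sentence keeps its trailing dot.
$sentences = preg_split('/(?<=\.) (?=[A-Z])/', implode(' ', $steps));

print_r($sentences);
```

Sentences that start with a lowercase letter or a digit would not be split by this pattern, which is the same limitation as the original one-liner.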
