Given the following sentence:
This is 10. way of doing this. And this is 43. street.
I want preg_split() to give this:
Array (
[0] => "This is 10. way of doing this"
[1] => "And this is 43. street"
)
I am using:
preg_split("/[^\d+]\./i", $sentence)
But this gives me:
Array (
[0] => "This is 10. way of doing thi"
[1] => "And this is 43. stree"
)
As you can see, the last character of each sentence is removed. I know why this happens, but I don't know how to prevent it from happening. Any ideas? Can lookaheads and lookbehinds help here? I am not really familiar with those.
You want to use a negative lookbehind assertion for that:
preg_split("/(?<!\d)\./i",$sentence)
The difference is that [^\d+] consumes a character, so that character becomes part of the match and preg_split() removes it along with the dot. The (?<!\d) assertion is also checked, but it is "zero-width", meaning it does not become part of the delimiter match, and thus won't be thrown away.
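To make the difference concrete, here is a minimal sketch that runs both patterns on the sentence from the question. The variable names are only illustrative, and the /i flag is left out because neither pattern contains letters.

```php
<?php
$sentence = 'This is 10. way of doing this. And this is 43. street.';

// Consuming version: the character class matches one real character
// (the "s" or "t" before the dot), so it is removed together with the dot.
$consuming = preg_split('/[^\d+]\./', $sentence);

// Zero-width version: the lookbehind only inspects the previous character,
// so the dot is the only thing that gets removed.
$zeroWidth = preg_split('/(?<!\d)\./', $sentence);

var_export($consuming);
var_export($zeroWidth);
```

Both results still contain a leading space on the second piece and an empty last element, because the sentence ends with a delimiter. Matching the following spaces as part of the delimiter and passing PREG_SPLIT_NO_EMPTY should take care of both.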
To split your string on literal dots that are not preceded by a digit, match a non-digit, then reset the start of the full-string match with \K (meaning "keep" everything matched up to here), then match the "disposable" characters: the literal dot and zero or more spaces.
Code (using \K):
$string = 'The is 10. way of doing this. And this is 43. street.';
var_export(
preg_split('~\D\K\. *~', $string, 0, PREG_SPLIT_NO_EMPTY)
);
or, using a negative lookbehind:
var_export(
preg_split('~(?<!\d)\. *~', $string, 0, PREG_SPLIT_NO_EMPTY)
);
or, using a positive lookbehind:
var_export(
preg_split('~(?<=\D)\. *~', $string, 0, PREG_SPLIT_NO_EMPTY)
);
Output: (all clean, no trailing dots, no trailing spaces, no unexpected lost characters)
array (
0 => 'The is 10. way of doing this',
1 => 'And this is 43. street',
)
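The three patterns agree on this input, but they are not strictly interchangeable. As a small sketch with a made-up input string (not from the original post): \D\K and (?<=\D) both require an actual non-digit character before the dot, while (?<!\d) also succeeds at the very start of the string.

```php
<?php
// Hypothetical input that begins with a dot, chosen only to show the edge case.
$input = '.Start. End.';

// \D must consume a real character before the dot, so the leading dot
// is never treated as a delimiter and stays in the first piece.
$withK = preg_split('~\D\K\. *~', $input, 0, PREG_SPLIT_NO_EMPTY);

// (?<!\d) only requires that no digit precedes the dot, which is also true
// at offset 0, so the leading dot is split off and discarded.
$withNegativeLookbehind = preg_split('~(?<!\d)\. *~', $input, 0, PREG_SPLIT_NO_EMPTY);

var_export($withK);
var_export($withNegativeLookbehind);
```

If your strings can start with a dot, pick the pattern whose behaviour you actually want there.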