Given a string, I want an array of strings containing words, each preceded by any non-word characters.
Example input string:
one "two" (three) -four-
The words in the string may be anything, even gibberish, with any amount of punctuation or symbols.
What I would like to see:
array: 'one', ' "two', '" (three', ') -four', '-'
Essentially, for each match the last thing is a word, preceded by anything left over from the previous match.
I will be using this in PHP. I have tried various combinations of preg_match_all() and preg_split(), with patterns containing many variations of "\\w", "\\b", "[^\\w]" and so on.
The Bigger Picture
How can I place a * after each word in the string for searching purposes?
If you just want to add an asterisk after each "word" you could do this:
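<?php
$test = 'one "two" (three) -four-';
// \w+ matches each run of word characters; $1 re-inserts it, followed by "*".
// Single quotes keep PHP from trying to interpolate the replacement string.
echo preg_replace('/(\w+)/', '$1*', $test);
?>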
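As a minimal sketch of how that replacement could be wrapped for the search use case, the helper below (addWildcards is a hypothetical name, not something from the question) uses $0, the whole match, so no capture group is needed. Note that \w also covers digits and underscores, which may or may not be what a search backend expects.

```php
<?php
// Hypothetical helper: append "*" to every run of word characters.
function addWildcards(string $text): string
{
    // $0 refers to the entire match, so the pattern needs no parentheses.
    return preg_replace('/\w+/', '$0*', $text);
}

echo addWildcards('one "two" (three) -four-'), "\n";
// prints: one* "two*" (three*) -four*-

echo addWildcards('foo_bar 42'), "\n";
// prints: foo_bar* 42*
```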
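You can split at the end of each word by combining a word boundary with a negative lookahead, like this:
$array = preg_split('/(?!\w)\b/', 'one "two" (three) -four-');
Here \b matches at every word boundary, and (?!\w) rejects the boundaries that are followed by a word character, so only the ends of words remain as split points. Calling print_r($array); gives the desired output:
Array
(
    [0] => one
    [1] =>  "two
    [2] => " (three
    [3] => ) -four
    [4] => -
)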
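One caveat with this split: if the input ends in a word, the final boundary sits at the very end of the string and preg_split() returns a trailing empty element. A minimal sketch that guards against this, assuming empty pieces are never wanted, passes PREG_SPLIT_NO_EMPTY. The function name splitAfterWords is made up for illustration.

```php
<?php
// Hypothetical helper: split a string after each word, keeping any
// preceding non-word characters attached to the word that follows them.
function splitAfterWords(string $text): array
{
    // \b = word boundary; (?!\w) keeps only boundaries at the END of a word.
    // PREG_SPLIT_NO_EMPTY drops the empty piece produced when the input
    // ends in a word.
    return preg_split('/(?!\w)\b/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// var_export() quotes each element, so leading spaces stay visible.
var_export(splitAfterWords('one "two" (three) -four-'));
```

For the example input the flag changes nothing, because the string ends in a non-word character.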
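Here is an example of matching a pattern at a given position in a string with preg_match() in PHP.
<?php
$subject = "abcdef";
// ^ anchors the pattern to the start of the string being searched.
$pattern = '/^def/';
// substr() drops the first three characters, so "def" is at the start.
// PREG_OFFSET_CAPTURE also returns the offset of each match.
preg_match($pattern, substr($subject, 3), $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>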
[^\w]*(\b\w*\b)?
------ ----------
   |        |
   |        |-> Matches a word 0 or 1 time
   |-> Matches 0 to many characters except [a-zA-Z0-9_]
You need to match!
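A minimal sketch of applying the pattern diagrammed above with preg_match_all(), which is one way to read "you need to match". The helper name matchWordsWithPrefix is an assumption for illustration. Because both halves of the pattern are optional, it can also match the empty string (for instance at the end of the input), so empty matches are filtered out here.

```php
<?php
// Hypothetical helper: collect every "non-word characters + word" match.
function matchWordsWithPrefix(string $text): array
{
    preg_match_all('/[^\w]*(\b\w*\b)?/', $text, $matches);

    // $matches[0] holds the full matches; drop the empty ones and reindex.
    $nonEmpty = array_filter($matches[0], function ($m) {
        return $m !== '';
    });
    return array_values($nonEmpty);
}

var_export(matchWordsWithPrefix('one "two" (three) -four-'));
```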