PHP regular expression to match words
Given a string, I want an array of strings containing words, each preceded by any non-word characters.
Example input string:
one "two" (three) -four-
The words in the string may be anything, even gibberish, with any amount of punctuation or symbols.
What I would like to see:
array:
  [0] => 'one'
  [1] => ' "two'
  [2] => '" (three'
  [3] => ') -four'
  [4] => '-'
Essentially, for each match the last thing is a word, preceded by anything left over from the previous match.
I will be using this in PHP. I have tried various combinations of preg_match_all() and preg_split(), with patterns containing many variations of "\\w", "\\b", "[^\\w]" and so on.
The Bigger Picture
How can I place a * after each word in the string for searching purposes?
If you just want to add an asterisk after each "word", you could do this:
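<?php
$test = 'one "two" (three) -four-';
// Append an asterisk to every run of word characters (\w+).
// Single quotes make it explicit that $1 is the backreference to group 1, not a PHP variable.
echo preg_replace('/(\w+)/', '$1*', $test);
// Prints: one* "two*" (three*) -four*-
?>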
http://phpfiddle.org/main/code/8nr-bpb
You can use a negative lookahead to split only at the word boundaries that end a word (a boundary not followed by a word character), like this:
$array = preg_split( '/(?!\w)\b/', 'one "two" (three) -four-');
A print_r($array); gives you the exact output desired:
Array
(
    [0] => one
    [1] =>  "two
    [2] => " (three
    [3] => ) -four
    [4] => -
)
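One caveat with the split above: when the input ends in a word, the last boundary sits at the very end of the string, and preg_split() can then return a trailing empty element. A minimal sketch, assuming you want such empty chunks dropped, passes the PREG_SPLIT_NO_EMPTY flag. The function name splitAfterWords is hypothetical and used only for illustration:

```php
<?php
// Hypothetical helper: split after each word so every chunk ends in a word,
// keeping leading punctuation and whitespace with the word that follows it.
function splitAfterWords(string $text): array
{
    // (?!\w)\b matches only the boundary at the END of a word.
    // Limit -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty chunks.
    return preg_split('/(?!\w)\b/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

$chunks = splitAfterWords('one "two" (three) -four-');
var_export($chunks);
```

For the example input this yields the same five chunks as the plain preg_split() call; the flag only makes a difference for inputs such as 'one two' that end in a word.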
Here is an example of how to find a word with regex in PHP.
<?php
$subject = "abcdef";
$pattern = '/^def/';
preg_match($pattern, substr($subject, 3), $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
[^\w]*(\b\w*\b)?
----- ----------
  |        |
  |        |-> Matches a word 0 or 1 time
  |-> Matches 0 to many characters except [a-zA-Z0-9_]
You need to match!
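To make this concrete, here is a minimal sketch that runs the pattern through preg_match_all() on the example input from the question. Because both halves of the pattern are optional, it can also match the empty string, so the sketch filters out zero-length matches. That filtering step is an addition for illustration, not part of the answer itself:

```php
<?php
$input = 'one "two" (three) -four-';

// [^\w]* grabs leading non-word characters; (\b\w*\b)? grabs the word, if any.
preg_match_all('/[^\w]*(\b\w*\b)?/', $input, $matches);

// $matches[0] holds the full matches. Drop zero-length matches and reindex.
$parts = array_values(array_filter($matches[0], function (string $m): bool {
    return $m !== '';
}));

var_export($parts);
```

Each remaining element is a word preceded by whatever non-word characters were left over from the previous match, with a final element for trailing punctuation.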