Get the PHP regex in string

For example, I have the following string:

@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]

I want to get the substrings that match @username[Full Name]. I'm really new to regex. I'm using the following code:

$mention_regex = '/@([A-Za-z0-9_]+)/i';
preg_match_all($mention_regex, $content, $matches);
var_dump($matches);

where $content is the string above. What should the correct regex be so that I get an array of matches in the @username[Full Name] format?

You can use:

@[^]]+]

i.e.:

$string = "@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]";  
preg_match_all('/@[^]]+]/', $string, $result);
print_r($result[0]);

Output:

Array
(
    [0] => @kirbypanganja[Kirby Panganja]
    [1] => @kyraminerva[Kyra]
    [2] => @watever[watever ever evergreen]
)

PHP Demo

Regex Demo and Explanation

Regex: /@[A-Za-z0-9_]+\[[a-zA-Z\s]+\]/

This will match an @ followed by the username characters and then a bracketed name made of letters and spaces.

Example: @thanSomeCharacters[Some Name Can contain space]

Try this code snippet here:

<?php
$content='@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]';
$mention_regex = '/@[A-Za-z0-9_]+\[[a-zA-Z\s]+\]/i';
preg_match_all($mention_regex, $content, $matches);
print_r($matches);
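
If the username and the full name are needed separately, capture groups can be added to a pattern of the same shape. The following is only a minimal sketch, assuming the same input string; the group names user and name and the variable $mentions are illustrative choices and do not come from the answers on this page.

```php
<?php
$content = '@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]';

// Named groups (hypothetical names) split each mention into its two parts.
// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all('/@(?<user>\w+)\[(?<name>[^\]]+)\]/', $content, $matches, PREG_SET_ORDER);

// Build a username => full name map; a repeated username would overwrite the earlier entry.
$mentions = [];
foreach ($matches as $m) {
    $mentions[$m['user']] = $m['name'];
}

print_r($mentions);
```

Keying the map by username is a design choice for this sketch; if the same user can be mentioned more than once and every occurrence matters, a plain list of pairs would be the safer structure.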

I'll start with a very direct, one-liner method that I believe is best and then discuss the other options...

Code (Demo):

$string = "@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]";

$result = preg_split('/]\K[^@]+/', $string, 0, PREG_SPLIT_NO_EMPTY);

var_export($result);

Output:

array (
  0 => '@kirbypanganja[Kirby Panganja]',
  1 => '@kyraminerva[Kyra]',
  2 => '@watever[watever ever evergreen]',
)

Pattern (Demo):

]     #match a literal closing square bracket
\K    #forget the matched closing square bracket
[^@]+ #match 1 or more non-at-signs

My pattern takes 12 steps, which is the same step efficiency as Pedro's pattern.

Using preg_split() offers the coder two benefits:

  1. It does not return a match count (or false) and does not require an output variable like preg_match_all() does, which means it can be used as a one-liner without a condition statement.
  2. It returns a one-dimensional array, versus the two-dimensional array from preg_match_all(). This means the entire returned array is instantly ready to unpack without any subarray access.

In case you are wondering what the 3rd and 4th parameters of preg_split() are: the 0 value means return an unlimited number of substrings. This is the default behavior, but the value is needed as a placeholder so that parameter 4 can be passed. PREG_SPLIT_NO_EMPTY removes any empty substrings that would have been generated by splitting at the start or end of the input string.
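
To see what PREG_SPLIT_NO_EMPTY changes, here is a small sketch using a made-up input that has trailing text after the last mention (the string and the variable names $withEmpty and $noEmpty are assumptions for illustration only). The trailing text is consumed as a delimiter, which leaves an empty element at the end unless the flag is set.

```php
<?php
// Hypothetical input: note the trailing " bye" after the last mention.
$string = '@a[A] x @b[B] bye';

// Without the flag, splitting at the end of the string leaves an empty last element.
$withEmpty = preg_split('/]\K[^@]+/', $string);

// With the flag, empty substrings are dropped from the result.
$noEmpty = preg_split('/]\K[^@]+/', $string, 0, PREG_SPLIT_NO_EMPTY);

var_export($withEmpty);
echo "\n";
var_export($noEmpty);
```

The sample string from the question ends with a closing bracket, so the flag makes no difference there; it only matters once the input has text after the final mention.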


That concludes my recommended method. Now I'll take a moment to compare the other answers currently posted on this page, and then present some non-regex methods which I do not recommend.

The most popular and intuitive method is to use a regex pattern with preg_match_all(). Both Sahil and Pedro have opted for this course of action. Let's compare the patterns that they've chosen...

Sahil's pattern /@[A-Za-z0-9_]+\[[a-zA-Z\s]+\]/i correctly matches the desired substrings in 18 steps, but contains redundancies, such as using the i modifier/flag despite listing A-Za-z in the character classes. Here is a demo. Also, [A-Za-z0-9_] is more simply expressed as \w.

Pedro's pattern /@[^]]+]/ correctly matches the desired substrings in 12 steps. Here is a demo.

By all comparisons, Pedro's method is superior to Sahil's because it has equal accuracy, higher efficiency, and greater pattern brevity. If you want to use preg_match_all(), you will not find a more refined regex pattern than Pedro's.
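
The accuracy comparison can be checked with a short sketch that runs each pattern against the sample string. The array keys, the $results variable, and the "simplified" variant (Sahil's pattern rewritten with \w and without the redundant ranges) are illustrative additions, not code from either answer.

```php
<?php
$string = '@kirbypanganja[Kirby Panganja] elow @kyraminerva[Kyra] test @watever[watever ever evergreen]';

$patterns = [
    'Sahil'      => '/@[A-Za-z0-9_]+\[[a-zA-Z\s]+\]/i',
    'simplified' => '/@\w+\[[a-z\s]+\]/i', // same idea, using \w and relying on the i flag
    'Pedro'      => '/@[^]]+]/',
];

// Collect the full-match list ($m[0]) produced by each pattern.
$results = [];
foreach ($patterns as $label => $pattern) {
    preg_match_all($pattern, $string, $m);
    $results[$label] = $m[0];
}

// Both comparisons are expected to be true for this input.
var_dump($results['Sahil'] === $results['Pedro']);
var_dump($results['simplified'] === $results['Pedro']);
```

The patterns agree on this particular input. They are not equivalent in general: a bracketed name containing a digit or an apostrophe, for instance, would be matched by Pedro's pattern but not by the letter-and-whitespace character class.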

That said, there are other ways to extract the desired substrings. First, the more tedious way that doesn't involve regex, which I would never recommend...

Regex-free method: strpos() & substr()

$result = [];
while (($start = strpos($string, '@')) !== false) {
    $result[] = substr($string, $start, ($stop = strpos($string, ']') + 1) - $start);
    $string = substr($string, $stop);
}
var_export($result);

Coders should always entertain the idea of a non-regex method when dissecting strings, but as you can see from the code above, it just isn't sensible for this case. It requires four function calls on each iteration and it isn't the easiest thing to read. So let's dismiss this method.

Here is another way that provides the correct result...

$result = [];
foreach (explode('@', $string) as $v) {
    if ($v) {
        $result[] = '@' . substr($v, 0, strrpos($v, ']') + 1);
    }
}

It makes fewer function calls than the previous regex-free method, but it is still too much handling for such a simple task.

At this point, it is clear that the most sensible methods use regex. And there is nothing wrong with choosing preg_match_all(); if this were my project, I might elect to use it. However, it is important to consider the directness of preg_split(). This function is just like explode() but with the ability to use a regex pattern. This question is a perfect stage for preg_split() because the substrings that should be omitted can also be used as the delimiters between the desired substrings.
