Matching strings that don't repeat characters in Regex

I'm a total newbie at regex. What I'm trying to do is check whether a numeric value repeats any digit. The repeated digits can be anywhere in the string, e.g.:

123456789 -> would return true
987612345 -> true

but:

122345678 -> would return false because it uses the digit 2 twice.
182345688 -> false

Is it possible to do this with regex in PHP?

In case you don't want to use regexes with massive recursive backtracking:

$duplicates = count(count_chars($test, 1)) < strlen($test);

Edit:

In case you want to use a regular expression, you only need to find one duplicate and then quit:

$duplicates = preg_match('/(.).*\1/', $test);

A string with a re-appearing character will return 1, e.g.:

$match = preg_match_all('/(.).*\1/', '121345678', $arr, PREG_PATTERN_ORDER);

Other strings will return 0, e.g.:

$match = preg_match_all('/(.).*\1/', '12345678', $arr, PREG_PATTERN_ORDER);

Therefore (I named it clean, as in "non-repeating"):

$clean = $match == 0;

EDIT:
By way of explanation: \1 is a back-reference to the first (and in this case only) pair of parentheses. So this regex matches when it finds a character that already appeared earlier in the string.
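
To check the back-reference pattern against the four sample values from the question, here is a minimal sketch. The wrapper name isAllDistinct is an assumption for illustration and not part of the original answer; it simply negates the result of preg_match so that "no repeat found" becomes true.

```php
<?php
// Hypothetical wrapper (name chosen for illustration only).
function isAllDistinct(string $s): bool
{
    // (.) captures one character, .* skips ahead, and \1 requires the
    // captured character to appear again. preg_match returns 1 when the
    // pattern matches and 0 when it does not.
    return preg_match('/(.).*\1/', $s) === 0;
}

// Sample values taken from the question.
foreach (['123456789', '987612345', '122345678', '182345688'] as $value) {
    echo $value, ' -> ', isAllDistinct($value) ? 'true' : 'false', "\n";
}
```

Note that . does not match a newline by default, which should not matter for purely numeric input.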

If they can be anywhere in your string, it's not that easy. I assume it is somehow possible using regexp, but I recommend doing it another way:

  • extract the single characters from the string into an array of characters
  • sort the array
  • check if two adjacent characters are the same

Or any equivalent technique. I think this problem is a bit too complicated to be solved elegantly with a regexp.
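
The split, sort, and compare steps listed above could be sketched in PHP roughly as follows. The function name hasDuplicateChars is hypothetical, and the sketch assumes single-byte characters (which holds for digits).

```php
<?php
// Hypothetical helper (not from the original answer).
function hasDuplicateChars(string $s): bool
{
    if ($s === '') {
        return false;               // nothing to compare
    }
    $chars = str_split($s);         // one array element per byte
    sort($chars, SORT_STRING);      // equal characters become neighbours
    for ($i = 1, $n = count($chars); $i < $n; $i++) {
        if ($chars[$i] === $chars[$i - 1]) {
            return true;            // two adjacent equal characters: a repeat
        }
    }
    return false;
}

var_dump(hasDuplicateChars('123456789')); // no digit repeats
var_dump(hasDuplicateChars('122345678')); // the digit 2 appears twice
```

Sorting costs O(n log n), so for long inputs a hash-based check is likely cheaper, but for short numeric strings the difference should be negligible.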

As for regex, I'm not 100% sure, but you can do it another way:

function hasRepeatingNumbers($number) {
    // Drop duplicate digits; the array shrinks only if a digit repeated.
    $numberArray = array_unique(str_split($number));
    if (count($numberArray) != strlen($number)) {
        return true;
    } else {
        return false;
    }
}

In the example above we remove any duplicate digits and compare the number of remaining digits with the length of the original string. If they differ, it's because duplicate digits were removed.

Then you should just need to:

if (hasRepeatingNumbers('123456789'))
    echo "There are repeating numbers";
else
    echo "No repeating numbers";

This should work just as well:

/(\d)(?=.*\1)/

It only looks for digits, and matches (and quits) at the first duplicate found.
Warning: this could be slow.

I imagine this would do it:
if ( preg_match( '/(\d)(?=.*\1)/', "your string", $match) ) ..

This method might cause problems if the digits 0-9 are all unique and the string is
very long. Theoretically, it would inspect 10 times the length of the string.

On the other hand, if you have more than 10 digits, there is at least one dup.
So, in a single pass, extract up to the first 11 digits. Then you can either
loop through the array elements (up to 11 digits), or use a hash if PHP does that.
This is the fastest method. It might need a verbose regex (11 capture buffers), but PCRE can't do a variable number of capture buffers.

Example in Perl (using a hash):

$_ = '12asasdf3456789 4 0 asdf 3';

my @found = /
 ^
  [^\d]*
  (\d) [^\d]*(\d?)[^\d]*(\d?)[^\d]*(\d?)[^\d]*(\d?)[^\d]*
  (\d?)[^\d]*(\d?)[^\d]*(\d?)[^\d]*(\d?)[^\d]*(\d?)[^\d]*
  (\d?)
/x;

my %seen;
for (@found) {
   next if $_ eq '';   # optional groups that matched nothing are not digits
   if ($seen{$_}++) {
      print "Found a duplicate: '$_'\n";
      last;
   }
}

Output:
Found a duplicate: '4'
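
PHP does have hashes, in the form of associative arrays, so the same single-pass idea could be sketched there without the 11-group regex. The function name firstRepeatedDigit is hypothetical, and the loop below is a swapped-in technique (a plain scan rather than the regex extraction shown in the Perl example). Because there are only ten distinct digits, it returns after examining at most 11 digits.

```php
<?php
// Hypothetical helper: returns the first repeated digit, or null if none.
function firstRepeatedDigit(string $s): ?string
{
    $seen = [];                     // associative array used as a hash set
    $length = strlen($s);
    for ($i = 0; $i < $length; $i++) {
        $ch = $s[$i];
        if (!ctype_digit($ch)) {
            continue;               // ignore non-digit characters
        }
        if (isset($seen[$ch])) {
            return $ch;             // stop at the first repeat
        }
        $seen[$ch] = true;
    }
    return null;
}

var_dump(firstRepeatedDigit('12asasdf3456789 4 0 asdf 3')); // same input as the Perl example
var_dump(firstRepeatedDigit('9876543210'));                 // all ten digits, none repeated
```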
