Regex - negative lookahead to exclude strings

Question

I am trying to find (and replace with something else) all parts of a text which

start with '/'
end with '/'
between the two /'s there can be anything, except the strings '.' and '..'.

(For your info, I am searching for and replacing directory and file names, hence the '.' and '..' should be excluded.)

This is the regular expression I came up with:

/(?!\.|\.\.)([^/]+)/

The second part

([^/]+)

matches every sequence of characters, '/' excluded. No character restrictions are required; I am simply interpreting the input.

The first part

(?!\.|\.\.)

uses the negative lookahead assertion to exclude the strings '.' and '..'.

However, this doesn't seem to work in PHP with mb_ereg_replace().

Can somebody help me out? I fail to see what's wrong with my regex.

Thank you.

Answer 1

POSIX regex probably doesn't support negative lookaheads. (I may be wrong, though.)

Anyway, since PCRE regexes are usually faster than POSIX ones, I think you can use the PCRE version of the same function; PCRE supports UTF-8 as well, via the u flag.

Consider this code as a substitute:

preg_replace('~/(?!\.|\.\.)([^/]+)/~u', "", $str);

EDIT: Even better is to use:

preg_replace('~/(?!\.)([^/]+)/~u', "", $str);
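A quick way to compare the two patterns above is to run them on a few inputs. This is only a sketch: the sample strings and the X replacement marker are made up for illustration. Because the lookahead is not anchored to the closing slash, the \. alternative already fails on any name that starts with a dot, so both patterns appear to behave identically and also skip names such as .hi.

```php
<?php
// Hypothetical sample inputs, chosen only for illustration.
$samples = array('/hi/', '/./', '/../', '/.hi/');

$first  = '~/(?!\.|\.\.)([^/]+)/~u';
$second = '~/(?!\.)([^/]+)/~u';

foreach ($samples as $s) {
    // Replace each match with a visible marker instead of the empty string.
    $a = preg_replace($first, 'X', $s);
    $b = preg_replace($second, 'X', $s);
    echo $s, ' => ', $a, ' | ', $b, "\n";
}
```

If names like .hi should be replaced too, the lookahead needs to include the closing slash, so that only the exact names . and .. are rejected.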

Answer 2

This is a little verbose, but it definitely does work:

#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#
^  |------------| |---------| |---------|
|        |             |               |
|        |        text starting with   |
|        |        two dots, that isn't |
|        |             "." or ".."     |
|  text starting with                  |
|  a dot, that isn't                text not starting
|  "." or ".."                         with a dot
|
delimiter

Does not match:

hi
//
/./
/../

Does match:

/hi/
/.hi/
/..hi/
/.../

Have a play around with it on http://regexpal.com/.

I wasn't sure whether or not you wanted to allow //. If you do, stick * before the last /.
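The two lists above can be checked mechanically. The following is a minimal sketch that feeds each listed input to preg_match() with the pattern from this answer; the variable names are mine and purely illustrative.

```php
<?php
$pattern = '#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#';

// Inputs taken from the "does not match" and "does match" lists.
$inputs = array('hi', '//', '/./', '/../', '/hi/', '/.hi/', '/..hi/', '/.../');

$results = array();
foreach ($inputs as $input) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$input] = preg_match($pattern, $input) === 1;
    echo str_pad($input, 8), $results[$input] ? 'match' : 'no match', "\n";
}
```

One side effect worth knowing: [^.] in the last alternative does not exclude '/', so an input such as /// also matches. Writing [^./] there instead would rule that out.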

Answer 3

I'm not against regex, but I would have done this instead:

function simplify_path($path, $directory_separator = "/", $equivalent = true){
  $path = trim($path);
  // if it's absolute, it stays absolute:
  $prepend = (substr($path,0,1) == $directory_separator)?$directory_separator:"";
  $path_array = explode($directory_separator, $path);
  if($prepend) array_shift($path_array);
  $output = array();
  foreach($path_array as $val){
    // a '..' is kept only when there is nothing to cancel (empty output or
    // a previous '..') and an equivalent path was requested; end($output)
    // is checked instead of a cached value so it stays correct after a pop
    if($val != '..' || ((empty($output) || end($output) == '..') && $equivalent)) {
      // skip empty segments (repeated separators) and '.'
      if($val != '' && $val != '.'){
        array_push($output, $val);
      }
    } elseif(!empty($output)) {
        // '..' cancels the previous segment
        array_pop($output);
    }
  }
  return $prepend.implode($directory_separator,$output);
}

Tests:

echo(simplify_path("../../../one/no/no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three"));
// =>  /../../one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three"));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/"));
// =>  ../../../one/two/three

I thought that it would be better to return an equivalent string, so I respected the occurrences of .. at the beginning of the string.

If you don't want them, you can call it with the third parameter $equivalent = false:

echo(simplify_path("../../../one/no/no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/", "/", false));
// =>  one/two/three

Answer 4

/(?!(\.|\.\.)/)([^/]+)/

This will allow ... as a valid name.
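Here is a minimal sketch of the anchored-lookahead idea: with the closing slash inside the negative lookahead, only the exact names . and .. are rejected, while ... and .hi pass. The ~ delimiters, the sample inputs, and the X marker are assumptions for illustration. The second pattern, with a lookbehind and a lookahead around the name, is my own variation rather than something from the answers; it is one possible way to handle adjacent segments that share a slash.

```php
<?php
// Anchored lookahead: only the exact names "." and ".." are rejected.
// The ~ delimiters are an assumption so the slashes need no escaping.
$pattern = '~/(?!(\.|\.\.)/)([^/]+)/~';

$names = array('/./', '/../', '/.../', '/.hi/', '/hi/');
foreach ($names as $name) {
    echo str_pad($name, 7), preg_match($pattern, $name) ? 'match' : 'no match', "\n";
}

// Each match consumes its closing slash, so in '/a/b/' only 'a' is replaced.
echo preg_replace($pattern, 'X', '/a/b/'), "\n";

// Variation (not from the answers): leave both slashes unconsumed by using
// a lookbehind and a lookahead, so every segment in a path can match.
$segment = '~(?<=/)(?!\.{1,2}/)[^/]+(?=/)~';
echo preg_replace($segment, 'X', '/a/./b/../c/'), "\n";
```

The variation replaces only the name and keeps the surrounding slashes, which may or may not be what the replacement step needs.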

4 answers

Answer 1: 4 2011-06-14 22:44:44
Answer 2: 3 2011-06-14 23:01:06
Answer 3: 1 2011-06-14 23:14:16
Answer 4: 0 2011-06-14 22:38:17
