Regex: match everything but a specific pattern
I need a regular expression able to match everything but a string starting with a specific pattern (specifically index.php and what follows, like index.php?id=2342343).
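Regex: match everything but:
- a string starting with a specific pattern (say, a string starting with foo)
- a string ending with a specific pattern (say, no world. at the end)
- a string containing specific text (say, a string containing foo)
- a string containing a specific character (say, a string with a | symbol)
- a string equal to some string (say, foo)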
- any text but a specific sequence of characters (say, match any text but cat): /cat(*SKIP)(*FAIL)|[^c]*(?:c(?!at)[^c]*)*/i or /cat(*SKIP)(*FAIL)|(?:(?!cat).)+/is. These rely on the (*SKIP)(*FAIL) backtracking control verbs, so they need PCRE or a compatible engine. Elsewhere, use (cat)|[^c]*(?:c(?!at)[^c]*)* (or (?s)(cat)|(?:(?!cat).)*, or (cat)|[^c]+(?:c(?!at)[^c]*)*|(?:c(?!at)[^c]*)+[^c]*) and then check in the host language: if Group 1 matched, it is not what we need; otherwise, take the match value if it is not empty.
Demo note: the newline \n is used inside negated character classes in demos to avoid match overflow to the neighboring line(s). It is not necessary when testing individual strings.
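The "check Group 1 in the host language" step can be sketched in Python roughly as follows. This is only an illustration: the helper name text_except_cat and the sample sentence are made up here, and the pattern is the first capture-group variant quoted above.

```python
import re

# Group 1 captures the text we want to skip ("cat"); the second
# alternative matches runs of text that contain no "cat".
PATTERN = re.compile(r"(cat)|[^c]*(?:c(?!at)[^c]*)*", re.IGNORECASE)

def text_except_cat(s):
    """Return the non-empty chunks of s that lie outside any 'cat'."""
    chunks = []
    for m in PATTERN.finditer(s):
        if m.group(1) is not None:  # Group 1 matched: not what we need
            continue
        if m.group(0):              # skip empty matches
            chunks.append(m.group(0))
    return chunks

print(text_except_cat("concatenate a cat and a cart"))
# ['con', 'enate a ', ' and a cart']
```

The order of the alternatives matters: (cat) is tried first at each position, so the second branch never gets a chance to start an empty match right in front of a "cat".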
Anchor note: In many languages, use \A to define the unambiguous start of string, and \z (in Python, it is \Z; in JavaScript, $ is OK) to define the very end of the string.
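To see why the anchor choice matters, here is a small Python sketch (the sample string is made up). Without re.MULTILINE, $ still matches just before a trailing newline, while \Z only matches at the very end of the string.

```python
import re

s = "world.\n"  # note the trailing newline

# "$" also matches right before a final newline, so this finds a match.
dollar_hit = bool(re.search(r"world\.$", s))

# "\Z" matches only at the absolute end of the string, so this does not.
z_hit = bool(re.search(r"world\.\Z", s))

print(dollar_hit)  # True
print(z_hit)       # False
```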
Dot note: In many flavors (but not POSIX, TRE, TCL), . matches any char but a newline char. Make sure you use a corresponding DOTALL modifier (/s in PCRE/Boost/.NET/Python/Java and /m in Ruby) for the . to match any char including a newline.
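In Python the DOTALL behaviour is enabled with the re.DOTALL flag or the inline (?s) modifier. A minimal sketch, using a made-up two-line string:

```python
import re

text = "foo\nbar"

# By default "." does not match the newline between foo and bar.
without_flag = re.match(r"foo.bar", text)

# With DOTALL (flag or inline modifier) "." matches the newline too.
with_flag = re.match(r"foo.bar", text, re.DOTALL)
with_inline = re.match(r"(?s)foo.bar", text)

print(without_flag is None)      # True
print(with_flag is not None)     # True
print(with_inline is not None)   # True
```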
Backslash note: In languages where you have to declare patterns with C strings allowing escape sequences (like \n for a newline), you need to double the backslashes escaping special characters so that the engine can treat them as literal characters (e.g. in Java, world\. will be declared as "world\\.", or use a character class: "world[.]"). Use raw string literals (Python r'\bworld\b'), C# verbatim string literals @"world\.", or slashy strings/regex literal notations like /world\./.
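The three spellings from the backslash note can be compared directly in Python. This is just a sketch with made-up test strings; it shows that the raw string and the doubled-backslash string are the same pattern, and that the character class behaves identically.

```python
import re

raw = r"world\."          # raw string: the backslash reaches the regex engine as-is
escaped = "world\\."      # ordinary string: the backslash has to be doubled
char_class = "world[.]"   # character class: no backslash needed at all

print(raw == escaped)     # True, both are the 7 characters world\.

for pattern in (raw, escaped, char_class):
    matches_dot = bool(re.search(pattern, "hello world."))
    matches_other = bool(re.search(pattern, "hello worldX"))
    print(pattern, matches_dot, matches_other)  # each line ends: True False
```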
Not a regexp expert, but I think you could use a negative lookahead from the start, e.g. ^(?!foo).*$ shouldn't match anything starting with foo.
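A quick way to check this in Python is sketched below. The first pattern is the one from this answer; the other two (not containing foo, not equal to foo) are my own variations on the same lookahead idea, added for illustration only.

```python
import re

not_starting_with_foo = re.compile(r"^(?!foo).*$")   # pattern from the answer
not_containing_foo = re.compile(r"^(?!.*foo).*$")    # illustrative variation
not_equal_to_foo = re.compile(r"^(?!foo$).*$")       # illustrative variation

print(bool(not_starting_with_foo.match("foobar")))   # False
print(bool(not_starting_with_foo.match("barfoo")))   # True
print(bool(not_containing_foo.match("barfoo")))      # False
print(bool(not_equal_to_foo.match("foo")))           # False
print(bool(not_equal_to_foo.match("foobar")))        # True
```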
You can put a ^ at the beginning of a character set to match anything but those characters.
[^=]* will match everything but =
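For example, in Python (the sample strings are made up; the | check mirrors the "no | symbol" case and uses re.fullmatch so the whole string has to consist of allowed characters):

```python
import re

# [^=]* consumes characters up to, but not including, the first "=".
key = re.match(r"[^=]*", "key=value").group(0)
print(key)  # key

# A whole string that must not contain "|" anywhere.
no_pipe_ok = re.fullmatch(r"[^|]*", "a,b,c") is not None
no_pipe_bad = re.fullmatch(r"[^|]*", "a|b") is not None
print(no_pipe_ok, no_pipe_bad)  # True False
```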
Just match /^index\.php/ and then reject whatever matches it.
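In Python this "match, then reject" approach might look like the sketch below. The list of names is hypothetical apart from the index.php?id=2342343 example from the question.

```python
import re

STARTS_WITH_INDEX = re.compile(r"^index\.php")

# Hypothetical inputs, for illustration only.
candidates = ["index.php?id=2342343", "index.html", "about.php", "index.php"]

# Keep only the strings that the positive pattern does NOT match.
kept = [c for c in candidates if not STARTS_WITH_INDEX.match(c)]
print(kept)  # ['index.html', 'about.php']
```

Inverting the result in code keeps the regex itself trivial, at the cost of not being usable where only a single pattern can be supplied.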
In Python:
>>> import re
>>> p = r'^(?!index\.php\?[0-9]+).*$'
>>> s1='index.php?12345'
>>> re.match(p,s1)
>>> s2='index.html?12345'
>>> re.match(p,s2)
<_sre.SRE_Match object at 0xb7d65fa8>
I need a regex able to match everything except a string starting with index.php (and what follows, like index.php?id=2342343)
Use the exec method:
let match, arr = [], myRe = /([\s\S]+?)(?:index\.php\?id.+)/g;
var str = 'http://regular-viragenia/index.php?id=2342343';
while ((match = myRe.exec(str)) != null) {
  arr.push(match[1]);
}
console.log(arr);

var myRe = /([\s\S]+?)(?:index\.php\?id=.+)/g;
var str = 'http://regular-viragenia/index.php?id=2342343';
var matches_array = myRe.exec(str);
console.log(matches_array[1]);
Or, to match the other part:
let match, arr = [], myRe = /index\.php\?id=((?:(?!index)[\s\S])*)/g;
var str = 'http://regular-viragenia/index.php?id=2342343index.php?id=111index.php?id=222';
while ((match = myRe.exec(str)) != null) {
  arr.push(match[1]);
}
console.log(arr);
I had this problem when doing multiple search-and-replace operations. I needed a negated pattern so that a match would not run on into the next occurrence.
import re
text = "alex ![image]dfsf(dfd.png) [image]fsdf(dfd.png) home ![image]fdsf(dfd.png) end"
replaced_text = re.sub(r'!\[image\](.*)\(.*\.png\)', '*', text)
print(replaced_text)
gave
alex * end
Basically, the greedy .* in the middle swallowed everything up to the last .png
I used the method from https://stackoverflow.com/a/17761124/429476 by Firish and got what I wanted. Here the space character is excluded from the match, and the following words are separated by spaces.
replaced_text = re.sub(r'!\[image\]([^ ]*)\([^ ]*\.png\)', '*', text)
and got what I wanted
alex * [image]fsdf(dfd.png) home * end
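For comparison, a lazy quantifier gives the same result on this input. This is an alternative I am adding for illustration, not the method from the linked answer, and unlike the [^ ]* version it can still run past a space if the nearest ( ... .png) is further along.

```python
import re

text = "alex ![image]dfsf(dfd.png) [image]fsdf(dfd.png) home ![image]fdsf(dfd.png) end"

# Greedy version from the post: ".*" runs on to the last ".png)".
greedy = re.sub(r"!\[image\](.*)\(.*\.png\)", "*", text)

# Lazy version: ".*?" stops at the first point where the rest can match.
lazy = re.sub(r"!\[image\](.*?)\(.*?\.png\)", "*", text)

print(greedy)  # alex * end
print(lazy)    # alex * [image]fsdf(dfd.png) home * end
```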
grep -v in shell
!~ in Perl
Please add more in other languages - I marked this as Community Wiki.
How about not using regex:
// In PHP
0 !== strpos($string, 'index.php')
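The same regex-free check could be written in Python with str.startswith. This is a sketch only; the helper name is_allowed and the sample inputs are hypothetical.

```python
def is_allowed(s, prefix="index.php"):
    """Return True for any string that does not start with the prefix."""
    return not s.startswith(prefix)

# Hypothetical inputs, for illustration only.
print(is_allowed("index.php?id=2342343"))  # False
print(is_allowed("index.html"))            # True
print(is_allowed("/index.php"))            # True, the prefix is not at position 0
```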