Regex to match anything (including the empty string) except a specific given string

I'd like to test whether a string contains "Kansas" followed by anything other than " State".

Examples:

"I am from Kansas"          true
"Kansas State is great"     false
"Kansas is a state"         true
"Kansas Kansas State"       true
"Kansas State vs Kansas"    true
"I'm from Kansas State"     false
"KansasState"               true

For PCRE, I believe the answer is this:

'Kansas(?! State)'

But MySQL's REGEXP doesn't seem to like that.

ADDENDUM: Thanks to David M for generalizing this question: How to convert a PCRE to a POSIX RE?

MySQL doesn't have lookaheads. A workaround is to make two tests:
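As a quick sanity check of the lookahead version, here is a minimal Python sketch that runs the pattern against the examples from the question. Python's `re` module is only a stand-in for a PCRE-style engine here, and the name `contains_kansas_not_state` is made up for illustration.

```python
import re

# Examples and expected results copied from the question.
EXAMPLES = [
    ("I am from Kansas", True),
    ("Kansas State is great", False),
    ("Kansas is a state", True),
    ("Kansas Kansas State", True),
    ("Kansas State vs Kansas", True),
    ("I'm from Kansas State", False),
    ("KansasState", True),
]

# Negative lookahead: "Kansas" not immediately followed by " State".
LOOKAHEAD = re.compile(r"Kansas(?! State)")


def contains_kansas_not_state(text):
    """True if some occurrence of 'Kansas' is not followed by ' State'."""
    return LOOKAHEAD.search(text) is not None


# Every example from the question should come out as listed.
for text, expected in EXAMPLES:
    assert contains_kansas_not_state(text) == expected, text
```

Because `search` tries every occurrence of "Kansas", one occurrence not followed by " State" is enough for a match, which is why "Kansas State vs Kansas" counts as true.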

WHERE yourcolumn LIKE '%Kansas%'
  AND yourcolumn NOT LIKE '%Kansas State%'

I used LIKE here instead of RLIKE because once you split it up like this, regular expressions are no longer required. However, if you still need regular expressions for other reasons, you can use this same technique.

Note that this does not match 'Kansas Kansas State' (or 'Kansas State vs Kansas'), which you wanted to match.
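To see which rows the two-test workaround misses, here is a small sketch that runs it against the question's examples. It uses Python's built-in `sqlite3` as a stand-in for MySQL (an assumption: SQLite's `LIKE` behaves the same way for this data), and the table name `examples` and column name `x` are illustrative.

```python
import sqlite3

# Rows the question expects to match, and rows it expects not to match.
SHOULD_MATCH = [
    "I am from Kansas",
    "Kansas is a state",
    "Kansas Kansas State",
    "Kansas State vs Kansas",
    "KansasState",
]
SHOULD_NOT_MATCH = [
    "Kansas State is great",
    "I'm from Kansas State",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examples (x TEXT)")
conn.executemany(
    "INSERT INTO examples (x) VALUES (?)",
    [(text,) for text in SHOULD_MATCH + SHOULD_NOT_MATCH],
)

# The two-test workaround: contains 'Kansas' but never 'Kansas State'.
TWO_TESTS = """
    SELECT x FROM examples
    WHERE x LIKE '%Kansas%'
      AND x NOT LIKE '%Kansas State%'
"""
matched = {row[0] for row in conn.execute(TWO_TESTS)}

# Rows that should have matched but were filtered out.
missed = sorted(set(SHOULD_MATCH) - matched)
print(missed)  # ['Kansas Kansas State', 'Kansas State vs Kansas']
```

The rows that get lost are exactly the ones that contain both a "Kansas State" and a separate "Kansas".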

Update: If matching 'Kansas Kansas State' is that important, then you can use this ugly regular expression, which is supported by MySQL:

'Kansas($|[^ ]| ($|[^S])| S($|[^t])| St($|[^a])| Sta($|[^t])| Stat($|[^e]))'
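On the more general "PCRE to POSIX RE" question, the expansion above follows a mechanical rule: for each position in the forbidden suffix, allow the text to end or to continue with a different character. A rough Python sketch of that rule follows. The helper name `expand_negative_lookahead` is hypothetical, and it deliberately accepts only letters, digits and spaces so that no escaping is needed.

```python
import re


def expand_negative_lookahead(prefix, suffix):
    """Build a lookahead-free pattern equivalent to prefix(?!suffix).

    Hypothetical helper for illustration. It only accepts letters, digits
    and spaces, so nothing needs escaping in either regex dialect.
    """
    if not suffix:
        raise ValueError("suffix must not be empty")
    for ch in prefix + suffix:
        if not (ch.isalnum() or ch == " "):
            raise ValueError(f"unsupported character: {ch!r}")
    # Position 0: the text ends right after the prefix, or the next
    # character differs from the first character of the suffix.
    branches = ["$", f"[^{suffix[0]}]"]
    # Position i > 0: the first i suffix characters match, then the text
    # ends or continues with a character other than suffix[i].
    for i in range(1, len(suffix)):
        branches.append(f"{suffix[:i]}($|[^{suffix[i]}])")
    return f"{prefix}({'|'.join(branches)})"


pattern = expand_negative_lookahead("Kansas", " State")

# The generated pattern is character-for-character the one shown above.
assert pattern == (
    "Kansas($|[^ ]| ($|[^S])| S($|[^t])| St($|[^a])| Sta($|[^t])| Stat($|[^e]))"
)
assert re.search(pattern, "Kansas Kansas State")
assert not re.search(pattern, "I'm from Kansas State")
```

The pattern grows with the length of the suffix, which is why this gets ugly quickly for longer strings.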

Oops: I just noticed Kip already updated his comment with a solution very similar to this.

This should work, assuming look-ahead assertions are allowed in MySQL regexes.

/Kansas(?! State)/

Edit: OK, this is super ugly, but it works for me in Perl and doesn't use a look-ahead assertion:

/Kansas(([^ ]|$)| (([^S]|$)|S(([^t]|$)|t(([^a]|$)|a(([^t]|$)|t([^e]|$))))))/
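One way to gain some confidence in a hand-expanded pattern like this is to compare it with the lookahead version over many short inputs. The sketch below does that in Python rather than Perl (an assumption: both engines treat these constructs the same way for input without newlines). The function name `disagreements` and the test alphabet are made up for illustration.

```python
import itertools
import re

LOOKAHEAD = re.compile(r"Kansas(?! State)")
# The nested pattern from the answer, without the Perl slashes.
NESTED = re.compile(
    r"Kansas(([^ ]|$)| (([^S]|$)|S(([^t]|$)|t(([^a]|$)|a(([^t]|$)|t([^e]|$))))))"
)


def disagreements(max_len=6, alphabet=" Staex"):
    """Return every 'Kansas' + tail string on which the two patterns differ.

    Tails are all strings up to max_len over the given alphabet, which
    covers every character of ' State' plus one unrelated character.
    """
    found = []
    for n in range(max_len + 1):
        for tail in itertools.product(alphabet, repeat=n):
            text = "Kansas" + "".join(tail)
            if bool(LOOKAHEAD.search(text)) != bool(NESTED.search(text)):
                found.append(text)
    return found


mismatches = disagreements()
assert mismatches == []
```

This is only a bounded check over one small alphabet, not a proof, but it does exercise every partial prefix of " State".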

More efficient than that large regex (depending, of course, on your data and the quality of the engine) is:

WHERE col LIKE '%Kansas%' AND
  (col NOT LIKE '%Kansas State%' OR
  REPLACE(col, 'Kansas State', '') LIKE '%Kansas%')

If Kansas usually appears in the form 'Kansas State', though, you may find this better:

WHERE col LIKE '%Kansas%' AND
  REPLACE(col, 'Kansas State', '') LIKE '%Kansas%'

This has the added advantage of being easier to maintain. It works less well if Kansas is common and text fields are large. Of course, you can test these on your own data and tell us how they compare.
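For anyone who wants to try the REPLACE approach without a MySQL server at hand, here is a minimal sketch using Python's built-in `sqlite3`. It assumes SQLite's `LIKE` and `REPLACE` behave like MySQL's for this data; the table name `examples` is illustrative.

```python
import sqlite3

# Examples and expected results copied from the question.
EXAMPLES = [
    ("I am from Kansas", True),
    ("Kansas State is great", False),
    ("Kansas is a state", True),
    ("Kansas Kansas State", True),
    ("Kansas State vs Kansas", True),
    ("I'm from Kansas State", False),
    ("KansasState", True),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examples (col TEXT)")
conn.executemany(
    "INSERT INTO examples (col) VALUES (?)",
    [(text,) for text, _ in EXAMPLES],
)

# Strip every 'Kansas State', then check whether a 'Kansas' is still left.
REPLACE_QUERY = """
    SELECT col FROM examples
    WHERE col LIKE '%Kansas%'
      AND REPLACE(col, 'Kansas State', '') LIKE '%Kansas%'
"""
matched = {row[0] for row in conn.execute(REPLACE_QUERY)}
expected = {text for text, ok in EXAMPLES if ok}

# On the question's examples the query agrees with the expected results.
assert matched == expected
```

Unlike the two-test LIKE workaround, this keeps 'Kansas Kansas State' and 'Kansas State vs Kansas', because a "Kansas" survives once the "Kansas State" occurrences are removed.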

This is ugly, but here you go; the query and its results follow. You might not need to expand the regex all the way to the end, depending on whether your input might include something like 'I need to get this man to surgery in Kansas Stat!'

mysql> select x,x RLIKE 'Kansas($|[^ ]| ($|[^S])| S($|[^t])| St($|[^a])| Sta($|[^t])| Stat($|[^e]))' AS result from examples;
+------------------------+--------+
| x                      | result |
+------------------------+--------+
| I am from Kansas       |      1 |
| Kansas State is great  |      0 |
| Kansas is a state      |      1 |
| Kansas Kansas State    |      1 |
| Kansas State vs Kansas |      1 |
| I'm from Kansas State  |      0 |
| KansasState            |      1 |
+------------------------+--------+
7 rows in set (0.00 sec)
