
How to select records from a MySQL database by regex

I have a regexp to validate user email addresses.

/^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})$/i

With the help of Active Record, I want to fetch from the database all users whose email address doesn't match this regexp. I tried the following scope to achieve the desired result, but all I get is ActiveRecord::Relation.

scope :not_match_email_regex, :conditions => ["NOT email REGEXP ?'", /^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})$/"]

This gives me the following query:

SELECT `users`.* FROM `users` WHERE (email REGEXP '--- !ruby/regexp /^(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\\-+)|([A-Za-z0-9]+\\.+)|([A-Za-z0-9]+\\++))*[A-Za-z0-9]+@((\\w+\\-+)|(\\w+\\.))*\\w{1,63}\\.[a-zA-Z]{2,})$/\n...\n')

I also tried to define this scope in the following way, with the same result:

scope :not_match_email_regex, :conditions => ["email REGEXP '(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,})'"]

The query it generates is:

SELECT `users`.* FROM `users` WHERE (email REGEXP '(|(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+.+)|([A-Za-z0-9]+++))*[A-Za-z0-9]+@((w+-+)|(w+.))*w{1,63}.[a-zA-Z]{2,})')

How can I fetch all records that match or don't match the given regex?

EDIT 12-11-30: small corrections, partly according to the comment by @innocent_rifle

The Regexp suggested here tries to make the same matches as the one in the original question.

1. When I first wrote my solution I forgot that you must escape \ in strings, because I was testing directly in MySQL. Mixing Regexps and string literals gets confusing, so I will write patterns in this form instead: /dot\./.source, which (in Ruby) gives "dot\\.".

2. REGEXP in MySQL (manual for 5.6, tested in 5.0.67) uses "C escape syntax in strings", so WHERE email REGEXP '\.' is still the same as WHERE email REGEXP '.'. To find the character "." you must use WHERE email REGEXP '\\.', and to achieve that you must use the code .where([ 'email REGEXP ?', "\\\\."]). It's more readable to use .where([ 'email REGEXP ?', /\\./.source ]) (MySQL needs 2 escapes). However, I prefer .where([ 'email REGEXP ?', /[.]/.source ]), because then I don't have to worry about how many escapes are needed.

3. You don't need to escape "-" in a Regexp, not even inside [] as long as it is the first or the last character there.
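The difference between an escaped dot and a character class can be checked in plain Ruby, without a database. This is only a small sketch of what Regexp#source returns; the variable names are made up for illustration.

```ruby
# Regexp#source returns the pattern text as written between the slashes.
dot_escaped = /\./.source   # two characters: a backslash and a dot
dot_class   = /[.]/.source  # three characters: "[", ".", "]"

puts dot_escaped.length  # 2
puts dot_class.length    # 3

# Written as a double-quoted Ruby string, the backslash has to be doubled.
puts dot_escaped == "\\."  # true
puts dot_class == "[.]"    # true

# The character class contains no backslash at all, so there is nothing
# for a later escaping layer to consume or double.
puts dot_class.include?("\\")  # false
```

Because the character class form carries no backslash, it reads the same in Ruby source, in the string and in the SQL statement.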


Some errors I found: there is the first regexp-or "|" in your expression, and the pattern should be passed as a String in the query, or via Regexp#source, which I prefer. I think there was also an extra quote at the end. Apart from that, are you really sure the regexp works? Have you tried it in the console on a string?

Also be aware that you won't catch emails that are NULL in the db. To include them you must use (<your existing expr in parentheses>) OR email IS NULL
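The NULL case arises because NULL REGEXP pat evaluates to NULL in MySQL, and NOT NULL is still NULL, so such rows never satisfy the WHERE clause. A minimal sketch of wrapping an existing condition follows; including_nulls is a hypothetical helper, not something Rails provides.

```ruby
# Hypothetical helper: wraps an existing SQL condition so that rows whose
# column is NULL are returned as well. The column name is interpolated
# directly, so it must be a trusted constant, never user input.
def including_nulls(condition_sql, column)
  "(#{condition_sql}) OR #{column} IS NULL"
end

sql = including_nulls('NOT (email REGEXP ?)', 'email')
puts sql  # (NOT (email REGEXP ?)) OR email IS NULL
```

The resulting string can take the place of the plain condition in a conditions array, with the pattern source still passed as the bind value.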

Regexp syntax in my MySQL version.

I also tested what @Olaf Dietsche wrote in his suggestion. It seems that it's not needed, but it's strongly recommended to follow the standard syntax anyway ( NOT (expr REGEXP pat) or expr NOT REGEXP pat ).

I have done some checking, and these things must be changed: use [A-Za-z0-9_] instead of \w, and \+ is not valid, you must use \\+ ( "\\\\+" if string); it is easier with [+] (in both Regexp and string).

It leads to the following REGEXP in MySQL:

'^(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]+)|([A-Za-z0-9]+[+]+))*[A-Za-z0-9]+@(([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]))*[A-Za-z0-9]{1,63}[.][a-zA-Z]{2,}$'
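The substitutions described above can also be applied mechanically. The sketch below assumes a hypothetical helper, mysql_safe_source, which exists in neither Rails nor MySQL. It handles only the four escapes discussed here and maps \w to [A-Za-z0-9_] as recommended above.

```ruby
# Replacement table for the escapes discussed above (keys are a backslash
# followed by one character; single quotes keep them literal).
MYSQL_SAFE_REPLACEMENTS = {
  '\\w' => '[A-Za-z0-9_]',
  '\\.' => '[.]',
  '\\+' => '[+]',
  '\\-' => '-'
}.freeze

# Hypothetical helper: returns the regexp source with those escapes
# rewritten into forms that contain no backslash.
# Limitation: it does not look at context, so an escaped "-" in the middle
# of a character class would turn into a range.
def mysql_safe_source(regexp)
  regexp.source.gsub(/\\[w.+-]/, MYSQL_SAFE_REPLACEMENTS)
end

puts mysql_safe_source(/([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++)/)
# ([A-Za-z0-9]+[.]+)|([A-Za-z0-9]+[+]+)
puts mysql_safe_source(/(\w+\-+)|(\w+\.)/)
# ([A-Za-z0-9_]+-+)|([A-Za-z0-9_]+[.])
```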

Small change suggestions

I don't understand your regexp exactly, so this only rewrites your regexp without changing what it will find.

First: change the whole string as I described above.

Then change

(([A-Za-z0-9]+_+)|([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]+)|([A-Za-z0-9]+[+]+))*

to

([A-Za-z0-9]+[-+_.]+)*

and

@(([A-Za-z0-9]+-+)|([A-Za-z0-9]+[.]))*

to

@([A-Za-z0-9]+[-.]+)*

Final code (change to ..., :conditions => ... syntax if you prefer that). I tried to make this find the same strings as in the comment by @innocent_rifle, only adding "_" in the expressions to the right of @.

.where([ 'NOT (email REGEXP ?)', /^([A-Za-z0-9]+[-+_.]+)*[A-Za-z0-9]+@([A-Za-z0-9]+[-._]+)*[A-Za-z0-9_]{1,63}[.][A-Za-z]{2,}$/.source ])
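Before running the query, the pattern's logic can be tried in the Ruby console on a few strings. This is only a sanity check under one assumption: Ruby's regexp engine and MySQL's are different implementations, but the final pattern uses only character classes, groups, quantifiers and anchors, which both support. The sample addresses are made up.

```ruby
EMAIL_PATTERN = /^([A-Za-z0-9]+[-+_.]+)*[A-Za-z0-9]+@([A-Za-z0-9]+[-._]+)*[A-Za-z0-9_]{1,63}[.][A-Za-z]{2,}$/

samples = [
  'john.doe@example.com',       # separator in the local part
  'user+tag@mail.example.org',  # plus sign and a subdomain
  'trailing.dot.@example.com',  # local part ends with a separator
  'no-at-sign.example.com',     # missing @
  'a@b'                         # no top-level domain
]

# Prints each sample with true (accepted) or false (rejected).
samples.each do |address|
  puts "#{address}: #{EMAIL_PATTERN.match?(address)}"
end
```

Rows that print false here are the kind the NOT (email REGEXP ?) query is meant to return.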

For validating email addresses, you might want to consider How to Find or Validate an Email Address. At least, this regexp looks a bit simpler.

According to MySQL - Regular Expressions, the proper syntax is

expr REGEXP pat

for a match, and

expr NOT REGEXP pat or NOT (expr REGEXP pat)

for the opposite. Don't forget the parentheses in the second version.
