Validate simple SQL with a Java regex

I'm trying to validate these two simple SQL queries:

String sql1 = "select * from table".toLowerCase();
String sql2 = "select value from table".toLowerCase();

using this pattern:

String pattern = "(select)(\\s)([\\*|\\w+])(\\s)(from)(\\s\\w+)";

Then I print the results:

System.out.println(sql1.matches(pattern)); // true
System.out.println(sql2.matches(pattern)); // false

The first one is fine, but I get false for the second statement. Can someone help?

You put square brackets inside a group. The following line:

String pattern = "(select)(\\s)([\\*|\\w+])(\\s)(from)(\\s\\w+)";

should be:

String pattern = "(select)(\\s)(\\*|\\w+)(\\s)(from)(\\s\\w+)";

Inside square brackets, + and | are treated as literal characters:

[\\*|\\w+] matches a single character that is * , | , + or a word character (a letter, digit, or underscore).
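A minimal sketch of the difference, using the two patterns from this thread. The class name CharClassDemo and the extra test strings are only illustrative and are not part of the original question.

```java
public class CharClassDemo {
    // Original pattern: the square brackets form a character class,
    // so the third group matches exactly ONE character.
    static final String BROKEN = "(select)(\\s)([\\*|\\w+])(\\s)(from)(\\s\\w+)";
    // Corrected pattern: alternation inside a group, "*" OR one or more word characters.
    static final String FIXED = "(select)(\\s)(\\*|\\w+)(\\s)(from)(\\s\\w+)";

    public static void main(String[] args) {
        String sql1 = "select * from table";
        String sql2 = "select value from table";

        System.out.println(sql1.matches(BROKEN));                  // true
        System.out.println(sql2.matches(BROKEN));                  // false: "value" is five characters
        System.out.println("select v from table".matches(BROKEN)); // true: a one-character column fits
        System.out.println("select | from table".matches(BROKEN)); // true: | is a literal in the class

        System.out.println(sql1.matches(FIXED));                   // true
        System.out.println(sql2.matches(FIXED));                   // true
        System.out.println("select | from table".matches(FIXED));  // false
    }
}
```

The broken pattern also accepts | and + as a "column", which is probably not what was intended.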

That's because you have put the star character and the word-character token, together with its quantifier, inside a character class.

When you want to choose between two alternatives, you shouldn't use a character class. Instead, use a logical OR ( | ) inside a capture group, like the following:

(\\*|\\w+)

Also note that when you put | or + inside a character class, the regex engine treats them as literal characters.

In addition, if you want to match the whole string, you don't need to put every word in a capture group. You can use the anchors ^ and $ to specify the start and end of the string:

"^select\\s(?:\\*|\\w+)\\sfrom\\s\\w+$"

(?:...) is a non-capturing group.
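As a rough illustration of the anchored pattern: String.matches() already requires the whole input to match, so the anchors mainly matter when the pattern is used with Matcher.find(). The class name AnchoredSqlCheck, the helper isSimpleSelect, and the sample inputs below are assumptions made for this sketch.

```java
import java.util.regex.Pattern;

public class AnchoredSqlCheck {
    // Pattern from the answer above: anchored, with a non-capturing group.
    static final Pattern ANCHORED =
            Pattern.compile("^select\\s(?:\\*|\\w+)\\sfrom\\s\\w+$");
    // Same pattern without anchors, for comparison only.
    static final Pattern UNANCHORED =
            Pattern.compile("select\\s(?:\\*|\\w+)\\sfrom\\s\\w+");

    // Hypothetical helper: true if the input is one of the simple SELECT forms.
    static boolean isSimpleSelect(String sql) {
        return ANCHORED.matcher(sql).find();
    }

    public static void main(String[] args) {
        System.out.println(isSimpleSelect("select * from table"));     // true
        System.out.println(isSimpleSelect("select value from table")); // true

        String trailing = "select value from table extra";
        System.out.println(isSimpleSelect(trailing));                  // false
        System.out.println(UNANCHORED.matcher(trailing).find());       // true: find() accepts a partial match

        // (?:...) groups without capturing, so the pattern has no numbered groups.
        System.out.println(ANCHORED.matcher("").groupCount());         // 0
    }
}
```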

Read more about regular expressions: http://www.regular-expressions.info/

I think the problem is that + in regex is greedy, so \\w+ is consuming all the words: "value", "from", and "table". You can make it lazy by putting a question mark after the +, like:

([\\*|\\w+?])
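This suggestion is easy to check. In the sketch below (the class name LazyQuantifierCheck and the helper capturedColumn are illustrative only), the lazy variant still fails for the second query, because ? is also a literal inside a character class. A greedy \\w+ outside the brackets stops at the first space, since \\w does not match whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyQuantifierCheck {
    // The suggested fragment placed into the full pattern from the question.
    static final String LAZY_IN_CLASS = "(select)(\\s)([\\*|\\w+?])(\\s)(from)(\\s\\w+)";
    // Greedy \w+ in a normal group, outside any character class.
    static final String GREEDY_GROUP = "(select)(\\s)(\\*|\\w+)(\\s)(from)(\\s\\w+)";

    // Hypothetical helper: returns what group 3 captured, or null if there is no match.
    static String capturedColumn(String sql) {
        Matcher m = Pattern.compile(GREEDY_GROUP).matcher(sql);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        String sql2 = "select value from table";

        System.out.println(sql2.matches(LAZY_IN_CLASS));                  // false: still one character
        System.out.println("select ? from table".matches(LAZY_IN_CLASS)); // true: ? is a literal here

        System.out.println(capturedColumn(sql2));                         // value
    }
}
```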
