
Regular expression to extract all words starting with colon

I would like to use a regular expression to extract "bind variable" parameters from a string that contains a SQL statement. In Oracle, the parameters are prefixed with a colon.

For example, like this:

SELECT * FROM employee WHERE name = :variable1 OR empno = :variable2

Can I use a regular expression to extract "variable1" and "variable2" from the string? That is, get all words that start with a colon and end with a space, a comma, or the end of the string.

(I don't care if I get the same name multiple times if the same variable has been used several times in the SQL statement; I can sort that out later.)

This might work:

:\w+

This just means "a colon, followed by one or more word-class characters".

This assumes your regular expression engine supports the \w shorthand, as most modern engines do (Perl, PCRE, Java, .NET, Python, JavaScript). In strict POSIX syntax, the equivalent is :[[:alnum:]_]+
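As a minimal sketch of how the pattern might be applied, the Python snippet below runs it over the example statement from the question. The parentheses around \w+ and the variable names are assumptions made for illustration; the capture group is only there so that the colon is dropped from the results.

```python
import re

sql = "SELECT * FROM employee WHERE name = :variable1 OR empno = :variable2"

# findall applies the pattern repeatedly across the whole string.
# The capture group keeps only the name, without the leading colon.
bind_names = re.findall(r":(\w+)", sql)
print(bind_names)  # ['variable1', 'variable2']
```

Duplicates are kept in order of appearance, which matches the question's requirement that repeated names can be sorted out later.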

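Of course, a single match of this pattern finds only one reference; for an arbitrary number of references, apply it repeatedly with your engine's global or find-all matching. To get exactly two in one match, and skip the noise between them, something like this should work: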

(:\w+).+(:\w+)
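One thing worth checking with the two-group pattern is how it behaves when the statement holds more than two references. The sketch below (Python, with a made-up three-parameter WHERE clause as the input) suggests that the greedy .+ makes the second group land on the last reference, so anything in between is skipped.

```python
import re

# Hypothetical statement with three bind variables.
sql = "WHERE a = :first AND b = :second AND c = :third"

# One match, two groups: the greedy .+ runs to the end of the string and
# backtracks only as far as needed, so group 2 is the LAST reference.
match = re.search(r"(:\w+).+(:\w+)", sql)
pair = match.groups()
print(pair)  # (':first', ':third')

# Applying the simple pattern repeatedly picks up every reference.
every_reference = re.findall(r":\w+", sql)
print(every_reference)  # [':first', ':second', ':third']
```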

To be able to handle such an easy case by yourself, you should have a look at a regex quickstart.

In the meantime, use:

:\w+

or, if your regex parser supports word boundaries:

:[a-zA-Z_0-9]+\b
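To illustrate what a trailing word boundary can add to an explicit character class (written here with a + quantifier so that whole names are matched), here is a small Python sketch. The input string with a non-ASCII name is invented for the demonstration, and the behaviour shown relies on Python treating accented letters as word characters in str patterns; other engines may differ.

```python
import re

# Hypothetical input: the first name contains a non-ASCII letter (U+00E4).
sql = "WHERE name = :n\u00e4me OR empno = :id2"

ascii_only = r":[a-zA-Z_0-9]+"
with_boundary = ascii_only + r"\b"

# Without \b the match stops silently in the middle of the first name.
truncated = re.findall(ascii_only, sql)
print(truncated)  # [':n', ':id2']

# With \b the partial match is rejected, because there is no word
# boundary between "n" and the accented letter that follows it.
whole_only = re.findall(with_boundary, sql)
print(whole_only)  # [':id2']
```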

Try the following:

sed -e 's/[ ,]/\n/g' yourFile.sql | grep '^:' | sort | uniq

assuming your SQL is in a file called "yourFile.sql" and your sed (GNU sed, for example) treats \n in the replacement as a newline.

This should give a list of variables with no duplicates.
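For readers without a Unix shell, the same split, filter, sort, and de-duplicate idea could be sketched in Python roughly as follows. The function name and the sample statement are hypothetical, and the sort order is by code point rather than by the shell's locale, so ordering may differ slightly from sort.

```python
import re
from typing import List

def unique_bind_tokens(sql_text: str) -> List[str]:
    """Return the distinct tokens that start with a colon, sorted."""
    # Split on spaces and commas, and also on newlines, because the
    # text is handled as one string here rather than line by line.
    tokens = re.split(r"[ ,\n]+", sql_text)
    # A set drops duplicates; sorted() turns it back into an ordered list.
    return sorted({token for token in tokens if token.startswith(":")})

sample = ("SELECT * FROM employee WHERE name = :variable1 "
          "OR empno = :variable2 OR boss = :variable1")
print(unique_bind_tokens(sample))  # [':variable1', ':variable2']
```

As with the shell pipeline, the tokens keep their leading colon, and a parameter followed directly by other punctuation such as a closing parenthesis would keep that character too.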
