
How to match 2 digits in regexp in Tcl?

I have the following string that I need to match using regexp:

"The value is 0x0208 and the type is INTERNATION"

I want to get the digits 02 and 08 and store them in two different variables. I use the following regexp:

regexp "0x(\[0-9]+)\[^\\n]+INTERNATION" "The value is 0x0208 and the type is INTERNATION" whole first second

It cannot get the second one. How do I fix it?

First, use curly braces for regular expressions. They make the patterns much easier to read because you don't have to add extra backslashes.

Second, use \d for digits to make the expression a little shorter, which also improves readability.
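As a small sketch of both points, the snippet below writes the same kind of pattern once in double quotes and once in braces. The variable names (quoted, braced, digits1, digits2) are made up for illustration.

```tcl
set text "The value is 0x0208 and the type is INTERNATION"

# Double-quoted pattern: the brackets must be escaped so that Tcl
# does not try to run [0-9] as a command substitution.
set quoted "0x(\[0-9\]+)"

# Braced pattern: passed to regexp verbatim, so no extra escaping,
# and \d can stand in for [0-9].
set braced {0x(\d+)}

regexp $quoted $text whole1 digits1
regexp $braced $text whole2 digits2

puts "$digits1 $digits2"   ;# prints: 0208 0208
```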

Searching for pairs of digits

In your description you say you want to search for two pairs of digits following 0x. Here's a simple way to do that:

{0x(\d\d)(\d\d)}

This says "0x, followed by two digits that we capture, followed by two more digits that we capture."
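Applied to the string from the question, the full command might look like this (a minimal sketch that reuses the variable names whole, first and second from the question):

```tcl
set text "The value is 0x0208 and the type is INTERNATION"

# regexp returns 1 on a match and fills the variables in order:
# the whole match, then the first and second capture groups.
if {[regexp {0x(\d\d)(\d\d)} $text whole first second]} {
    puts "first=$first second=$second"   ;# prints: first=02 second=08
} else {
    puts "no match"
}
```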

Searching for hexadecimal characters

Typically, hex numbers are preceded by 0x, which makes me think you are actually trying to parse a hex number. If that's true, you need to search for more than just decimal digits. To match a hex digit you need to use [0-9a-f]. Once a pattern gets slightly long (e.g. [0-9a-f] vs. \d), you don't want to keep repeating it, so another way to say "two of these" is to use {2} rather than repeating the pattern.

Putting that all together, to match two groups of two hex digits you could use something like this:

{0x([0-9a-f]{2})([0-9a-f]{2})}
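Here is a short sketch of that pattern in use. The input string is a hypothetical one chosen to contain lowercase hex letters, and the conversion with scan is an optional extra step that the question did not ask for.

```tcl
# Hypothetical input with lowercase hex digits.
set text "The value is 0x1fa0"

if {[regexp {0x([0-9a-f]{2})([0-9a-f]{2})} $text whole first second]} {
    # The captures are strings; scan with %x converts them to integers.
    scan $first  %x firstValue
    scan $second %x secondValue
    puts "$first=$firstValue $second=$secondValue"   ;# prints: 1f=31 a0=160
}
```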

Dealing with upper and lower case

Note that this pattern assumes the hex digits are lowercase. If your data might contain uppercase letters, there are at least four ways to handle that:

  1. use the -nocase option of the regexp command
  2. use both upper and lowercase characters in the expression
  3. convert the string to lowercase before matching
  4. add embedded options to turn off case sensitivity

Of those, the last is likely the least obvious solution, so I'll present it here.

Tcl regular expressions can have a special sequence at the very start of the pattern that modifies how the regular expression works. In this case we want to tell it to ignore case. The way to do that is to add (?i) at the start of the pattern:

{(?i)0x([0-9a-f]{2})([0-9a-f]{2})}
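As a quick check, the sketch below runs that pattern against two hypothetical inputs, one lowercase and one uppercase. The results list exists only to collect the captures for display.

```tcl
set pattern {(?i)0x([0-9a-f]{2})([0-9a-f]{2})}
set results {}

# Two made-up inputs; (?i) applies to the whole pattern, so the
# leading 0x also matches 0X.
foreach text {"value 0x1fa0" "value 0X1FA0"} {
    if {[regexp $pattern $text whole first second]} {
        lappend results "$first $second"
        puts "$whole -> $first $second"
    }
}
# prints:
#   0x1fa0 -> 1f a0
#   0X1FA0 -> 1F A0
```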

For more information on embedded options, see the metasyntax section of the re_syntax man page.
