
Regex to match Table of contents pattern

Consider the following notation:

NN = number/digit
x = any single letter

I want to match these patterns:

1. NN
2. NNx
3. NN.NN
4. NN.NNx
5. NN.NN.NN
6. NN.NN.NNx

Examples that need to match:

1. 20
2. 20a
3. 20.20
4. 20.20a
5. 20.20.20
6. 20.20.20a

Right now I am trying to use this regex:

\b\d+\.?\d+\.?\d+?[a-z]?\b

But it fails.

Any help would be greatly appreciated, thanks! XD

EDIT:

I am matching this:

<fn:footnote fr="10.23.20a">    (Just a sample)

I already have a regex that extracts the '10.23.20a'.

Now I want to check whether this value is valid; only strings in the 6 forms shown above should be accepted.

These examples are invalid:

1. 20.a
2. 20a.20.20
3. etc.

Many thanks for your help! :D

Every number in your pattern is written as \d+, which means one or more digits, and there are three of them. So you require at least three digits in total. Try grouping the digits with their periods:

^\d+(?:[.]\d+){0,2}[a-z]?$

The ?: is just an optimization (and a good practice) that suppresses capturing. [.] and \. are completely interchangeable, but I prefer the readability of the former. Choose whatever you like best.

If you actually want to capture the numbers and the letter, there are two options:
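To check the grouped pattern against the examples from the question, here is a minimal sketch in Python's `re` module (not the .NET flavor assumed elsewhere in this answer; for these two patterns the syntax is the same). The names `ORIGINAL`, `GROUPED`, `VALID`, `INVALID` and `is_valid` are made up for illustration.

```python
import re

# Pattern from the question: three number parts, each needing at least one digit.
ORIGINAL = re.compile(r"\b\d+\.?\d+\.?\d+?[a-z]?\b")
# Grouped pattern: one number, then up to two ".number" groups, then an optional letter.
GROUPED = re.compile(r"^\d+(?:[.]\d+){0,2}[a-z]?$")

VALID = ["20", "20a", "20.20", "20.20a", "20.20.20", "20.20.20a"]
INVALID = ["20.a", "20a.20.20"]


def is_valid(value: str) -> bool:
    # fullmatch is used because Python's $ also matches just before a trailing newline.
    return GROUPED.fullmatch(value) is not None


# Valid examples that the original pattern cannot match as a whole string.
original_misses = [v for v in VALID if ORIGINAL.fullmatch(v) is None]

print(original_misses)                   # ['20', '20a']
print([is_valid(v) for v in VALID])      # all True
print([is_valid(v) for v in INVALID])    # all False
```

The original pattern rejects exactly the two shortest forms, because "20" and "20a" contain only two digits while the pattern demands three.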

^(?<first>\d+)(?:[.](?<second>\d+))?(?:[.](?<third>\d+))?(?<letter>[a-z])?$

Note that the important point is to group a period and its digits together and make them optional together. You could just as well use unnamed groups; it doesn't really matter. However, if you use my version, you can access the parts through (for instance)

match.Groups["first"].Value

where match is a Match object returned by Regex.Match or Regex.Matches, for example.

Alternatively, you can use .NET's feature of capturing multiple values with one group:
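For readers outside .NET, the same named-group idea can be sketched in Python. Note the assumption: Python's `re` module spells named groups as `(?P<name>...)` rather than `(?<name>...)`, and the helper name `split_reference` is hypothetical.

```python
import re

# Same structure as the named-group pattern above, using Python's (?P<name>...) syntax.
PARTS = re.compile(
    r"^(?P<first>\d+)"
    r"(?:[.](?P<second>\d+))?"
    r"(?:[.](?P<third>\d+))?"
    r"(?P<letter>[a-z])?$"
)


def split_reference(value: str):
    """Return the named parts as a dict, or None if the value is invalid."""
    m = PARTS.fullmatch(value)
    if m is None:
        return None
    # Groups that did not take part in the match come back as None.
    return m.groupdict()


print(split_reference("10.23.20a"))
print(split_reference("20a"))
print(split_reference("20.a"))
```

Optional parts that are absent show up as `None`, which plays the role of an unsuccessful group in .NET.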

^(?<d>\d+)(?:[.](?<d>\d+)){0,2}(?<letter>[a-z])?$

Now match.Groups["d"].Captures will contain a list of all captured numbers (1 to 3), and match.Groups["letter"].Value will still contain the letter if there was one.
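Keeping every capture of a repeated group is a .NET feature; Python's standard `re` module only remembers the last one. A rough equivalent, swapping in a different technique, is to capture the whole dotted number run as one group and split it afterwards. This is only a sketch, and `SHAPE` and `numbers_and_letter` are illustrative names.

```python
import re

# Capture the whole dotted run as one group, plus the optional trailing letter.
SHAPE = re.compile(r"^(?P<numbers>\d+(?:[.]\d+){0,2})(?P<letter>[a-z])?$")


def numbers_and_letter(value: str):
    """Return (list of number strings, letter or None), or None if invalid."""
    m = SHAPE.fullmatch(value)
    if m is None:
        return None
    # The pattern guarantees 1 to 3 dot-separated digit runs, so split is safe here.
    return m.group("numbers").split("."), m.group("letter")


print(numbers_and_letter("10.23.20a"))   # (['10', '23', '20'], 'a')
print(numbers_and_letter("20"))          # (['20'], None)
print(numbers_and_letter("20.20.20.20")) # None
```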

Try this:

^\d+(?:(?:\.\d+)*[a-z]?)$
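One thing to be aware of with this variant: the `*` puts no upper limit on the number of dotted groups, so it also accepts values nested deeper than the six forms listed in the question. A small Python sketch of the difference (the names `UNBOUNDED` and `BOUNDED` are made up; the bounded pattern simply replaces `*` with `{0,2}`):

```python
import re

# The pattern above: any number of ".number" groups.
UNBOUNDED = re.compile(r"^\d+(?:(?:\.\d+)*[a-z]?)$")
# Same idea, limited to at most two ".number" groups.
BOUNDED = re.compile(r"^\d+(?:\.\d+){0,2}[a-z]?$")

VALID = ["20", "20a", "20.20", "20.20a", "20.20.20", "20.20.20a"]
too_deep = "20.20.20.20a"

unbounded_accepts = UNBOUNDED.fullmatch(too_deep) is not None
bounded_accepts = BOUNDED.fullmatch(too_deep) is not None

print(unbounded_accepts)  # True
print(bounded_accepts)    # False
```

Both patterns accept all six valid examples; they differ only on inputs with four or more numeric levels, so the choice depends on whether such values should be rejected.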


 