Regex between two nth position characters
I'm trying to fetch some data from a text string that lies between two delimiter characters (_), where the value I need may be the word at the nth position.
Currently I have the following:
!((?:.*?(_)){2})_(.+?)$
working on the following data:
D20_Mbps_U10_Mbps_TC4_P
where I would expect to get
U10
but I get nothing, as the first part captures
D20_Mbps_
and thus leaves nothing for the second part to capture.
I've tried
_\s*(.*?)(?=\s*_)
But this only gives me the first occurrence, whereas I need the one at the nth position, with n supplied at runtime.
Any ideas?
Thanks
Let me try answering this in detail.
When you want to match the Nth occurrence of a substring within a delimited string, you should really think of a String.Split-style function. In your case, splitting on _ and picking the values you need is a trivial task.
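As a minimal sketch of the split approach, assuming Python rather than the .NET String.Split mentioned above: the helper name nth_field and its 1-based n are illustrative choices, not part of the original question.

```python
from typing import Optional


def nth_field(text: str, n: int, delimiter: str = "_") -> Optional[str]:
    """Return the n-th (1-based) delimiter-separated field, or None if absent.

    nth_field is a hypothetical helper used only for illustration.
    """
    if n < 1:
        raise ValueError("n is 1-based and must be >= 1")
    fields = text.split(delimiter)
    # Guard against asking for a field beyond the end of the string.
    return fields[n - 1] if n <= len(fields) else None


print(nth_field("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

Because n is an ordinary argument, it can be supplied at runtime without rebuilding any pattern.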
When you cannot use code to extract the value and must rely on a regex alone, you can do this with a limiting quantifier, grouping, and capturing (in Java and .NET, the same can be achieved even without capturing).
The main idea is to match 0 or more characters other than the delimiter, followed by the delimiter itself, and to repeat that group N-1 times. After that, capture the run of non-delimiter characters that follows. For the 3rd element (N = 3):
^(?:[^_]*_){2}([^_]*)
Group 1 will contain U10.
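Since n has to be supplied at runtime, the quantifier can be filled in when the pattern is built. The following is a sketch in Python, assuming a single-character delimiter; nth_field_regex is a hypothetical name introduced here for illustration.

```python
import re


def nth_field_regex(text, n, delimiter="_"):
    """Return the n-th (1-based) field using the ^(?:[^_]*_){n-1}([^_]*) idea.

    Assumes a single-character delimiter; the helper name is illustrative.
    """
    if n < 1:
        raise ValueError("n is 1-based and must be >= 1")
    d = re.escape(delimiter)
    # Skip n-1 "field + delimiter" groups, then capture the next field.
    pattern = rf"^(?:[^{d}]*{d}){{{n - 1}}}([^{d}]*)"
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(nth_field_regex("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

If the string has fewer than n fields, the repeated group cannot be satisfied and the function returns None.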
Or another variation:
^(?:[^_]*_){2}([^_]*)_(.+)$
This will capture the 3rd _-delimited element into Group 1. Group 2 in this case holds the 4th and later elements, that is, the rest of the string up to the end.
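To illustrate the two groups of this variation, here is a short Python sketch; the function name third_and_rest is made up for the example.

```python
import re

# The variation from the answer: Group 1 = 3rd field, Group 2 = everything after it.
PATTERN = re.compile(r"^(?:[^_]*_){2}([^_]*)_(.+)$")


def third_and_rest(text):
    """Return (third_field, remainder) or None when the pattern does not match."""
    match = PATTERN.match(text)
    return (match.group(1), match.group(2)) if match else None


print(third_and_rest("D20_Mbps_U10_Mbps_TC4_P"))  # ('U10', 'Mbps_TC4_P')
print(third_and_rest("a_b_c"))  # None
```

Note that this variation requires at least one more delimiter and character after the 3rd element, so a string with exactly three fields does not match.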
Note that in some regex flavors { and ( must be escaped to act as quantifier and grouping operators (Vim, sed without extended regex syntax, etc.).