How can I parse a substring with a regular expression?
My example unparsed data is:
"8$154#3$021308831#7$NAME SURNAME#11$2166220160#10$5383237309#52$05408166#"
I want to extract the data that sits between each $ and the following #. I would like to see a result like this:
Between 8$ and # -> My data is 154,
Between 3$ and # -> My data is 021308831,
Between 7$ and # -> My data is NAME SURNAME,
Between 11$ and # -> My data is 2166220160,
Between 10$ and # -> My data is 5383237309,
Between 52$ and # -> My data is 05408166.
Thanks for your reply.
(\d+\$)(.*?)#
See it on Rubular.
You will find the first part (e.g. 8$) in capturing group 1 and the corresponding data in group 2.
The parentheses are what cause the results to be stored in those capturing groups. The \d+ will match at least one digit. The .*? is a lazy match for everything up to the next #.
You can split the data into an array at each #. With
String[] entries = data.Split('#');
you will get an array with "8$154", "3$021308831", etc. (plus an empty last entry, because the data ends with #).
Now you just work with the entries and split each one at the dollar sign:
String[] tmp = entries[0].Split('$');
So you get
tmp[0] = "8";
tmp[1] = "154";
Build in some checks and you will be happy. No need for regex here, I suppose.
If you have "8$15$4#3$021308831", then tmp will contain:
tmp[0] = "8"; // your key!
tmp[1] = "15"; // data part
tmp[2] = "4"; // data part ($ is missing!)
So you would have to concatenate all tmp entries from index 1 onward, putting the $ back between them:
StringBuilder value = new StringBuilder();
for (int i = 1; i < tmp.Length; i++)
{
    // Re-insert the '$' that Split removed between data parts.
    if (i > 1) value.Append("$");
    value.Append(tmp[i]);
}
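As a possible alternative to re-joining the parts, here is a minimal sketch that relies on the Split overload taking a maximum count, so that only the first $ separates key from value. The class name KeyValueSplit and the helper SplitEntry are hypothetical names introduced for illustration; they are not part of the original answer.

```csharp
using System;

static class KeyValueSplit
{
    // Hypothetical helper (not from the original answer): splits one entry
    // such as "8$15$4" at the first '$' only.
    public static string[] SplitEntry(string entry)
    {
        // The count argument 2 limits the result to at most two substrings,
        // so any further '$' stays inside the second one.
        return entry.Split(new[] { '$' }, 2);
    }

    static void Main()
    {
        string[] tmp = SplitEntry("8$15$4");
        Console.WriteLine(tmp[0]); // key: 8
        Console.WriteLine(tmp[1]); // value with the inner '$' kept: 15$4
    }
}
```

With this approach the StringBuilder loop is not needed, but the same checks still apply: an entry without any $ yields an array with a single element.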
using System;

class Program
{
    static void Main(string[] args)
    {
        string text = "8$154#3$021308831#7$NAME SURNAME#11$2166220160#10$5383237309#52$05408166#";
        // Splitting at both separators yields alternating keys and values.
        string[] values = text.Split('$', '#');
        for (var i = 0; i < values.Length - 1; i = i + 2)
        {
            Console.WriteLine("Between " + values[i] + "$ and # -> My data is " + values[i + 1]);
        }
        Console.ReadLine();
    }
}
OK, taking stema's expression, which works:
using System.Text;
using System.Text.RegularExpressions;

// The statements below belong inside a method that returns a string.
string nonParsed = "8$...";
MatchCollection matches = Regex.Matches(nonParsed, @"(\d+\$)(.*?)#");
StringBuilder result = new StringBuilder();
for (int i = 0; i < matches.Count; i++)
{
    Match match = matches[i];
    // Group 1 is the key including its '$', group 2 is the data.
    result.AppendFormat("Between {0} and # -> My data is {1}",
        match.Groups[1].Value,
        match.Groups[2].Value);
    if (i < matches.Count - 1)
    {
        result.AppendLine(",");
    }
    else
    {
        result.Append(".");
    }
}
return result.ToString();
Thanks to stema, this copes with the $ repeating within the value.
If you want to use a regular expression, this should do it:
\$([\w\d\s]+)\#
This will match what is between $ and #:
\$(.*?)#
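To show how the lazy pattern above could be applied, here is a minimal sketch in C#. The class name BetweenDollarAndHash and the helper Extract are hypothetical names introduced for illustration, and the input is a shortened version of the sample data from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BetweenDollarAndHash
{
    // Hypothetical helper: returns every value found between a '$'
    // and the next '#'.
    public static List<string> Extract(string data)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(data, @"\$(.*?)#"))
        {
            // Group 1 holds the lazily matched text without '$' and '#'.
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    static void Main()
    {
        // Shortened version of the sample data from the question.
        foreach (string value in Extract("8$154#3$021308831#7$NAME SURNAME#"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Unlike the pattern with two capturing groups, this one does not capture the number in front of the $, so it only fits when the keys are not needed.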