
How can I parse a substring with a regular expression?

My example unparsed data is:

"8$154#3$021308831#7$NAME SURNAME#11$2166220160#10$5383237309#52$05408166#"

I want to extract the data between each $ and the following #. I would like to see a result like this:

Between 8$ and # -> My data is 154 ,
Between 3$ and # -> My data is 021308831 ,
Between 7$ and # -> My data is NAME SURNAME ,
Between 11$ and # -> My data is 2166220160 ,
Between 10$ and # -> My data is 5383237309 ,
Between 52$ and # -> My data is 05408166 .

Thanks for your reply.

(\d+\$)(.*?)#

See it on Rubular

You will find the first part (e.g. 8$) in capturing group 1 and the corresponding data in group 2.

The parentheses are responsible for storing the results in those capturing groups. \d+ matches at least one digit. .*? is a lazy match for everything up to the next #.

You can split the string into an array at each #. With

String[] entries = data.Split('#');

you will get an array with "8$154", "3$021308831", etc. (and, because the data ends with #, an empty string as the last entry).

Now you just work with the entries and split each one at the dollar sign:

String[] tmp = entries[0].Split('$');

So you get

tmp[0] = "8";
tmp[1] = "154";

Build in some checks and you will be happy. No need for regex here I suppose.
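The "checks" mentioned above could look roughly like the following. This is only a minimal sketch: the class name SplitParser, the method name Parse, and the choice to throw a FormatException on a malformed entry are assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

static class SplitParser
{
    // Parses "key$value#key$value#..." into key/value pairs, in input order.
    public static List<KeyValuePair<string, string>> Parse(string data)
    {
        var result = new List<KeyValuePair<string, string>>();

        foreach (string entry in data.Split('#'))
        {
            // The trailing '#' produces one empty entry; skip it.
            if (entry.Length == 0)
            {
                continue;
            }

            string[] tmp = entry.Split('$');

            // Exactly one '$' is expected per entry (assumption of this sketch).
            if (tmp.Length != 2)
            {
                throw new FormatException("Unexpected entry: " + entry);
            }

            result.Add(new KeyValuePair<string, string>(tmp[0], tmp[1]));
        }

        return result;
    }
}
```

Calling SplitParser.Parse on the example data from the question should give six pairs, the first with key "8" and value "154".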

If you have "8$15$4#3$021308831", then tmp will contain:

tmp[0] = "8"; // your key!
tmp[1] = "15"; // data part
tmp[2] = "4"; // data part ($ is missing!)

So you would have to concatenate all tmp entries from index 1 onward, putting the $ back between them:

StringBuilder value = new StringBuilder();
for(int i = 1; i < tmp.Length; i++)
{
    if(i > 1) value.Append("$");
    value.Append(tmp[i]);
}
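As an alternative to re-joining the pieces, the Split overload that takes a maximum count can stop after the first $. A small sketch, assuming a hypothetical helper named EntrySplitter.SplitEntry (not from the original answer):

```csharp
using System;

static class EntrySplitter
{
    // Splits one "key$value" entry at the first '$' only.
    public static string[] SplitEntry(string entry)
    {
        // The count argument 2 caps the result at two substrings,
        // so the second one keeps any remaining '$' characters.
        return entry.Split(new[] { '$' }, 2);
    }
}
```

For the entry "8$15$4" this should yield "8" and "15$4", so no StringBuilder loop is needed.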

using System;

class Program
{
    static void Main(string[] args)
    {
        string text = "8$154#3$021308831#7$NAME SURNAME#11$2166220160#10$5383237309#52$05408166#";
        string[] values = text.Split('$', '#');
        for (var i = 0; i < values.Length - 1; i = i + 2)
        {
            Console.WriteLine("Between " + values[i] + "$ and # -> My data is " + values[i+1]);
        }
        Console.ReadLine();
    }
}

OK, taking stema's expression, which works:

using System.Text;
using System.Text.RegularExpressions;

string nonParsed = "8$...";

MatchCollection matches = Regex.Matches(nonParsed, @"(\d+\$)(.*?)#");

StringBuilder result = new StringBuilder();

for(int i = 0; i < matches.Count; i++)
{
    Match match = matches[i];

    result.AppendFormat("Between {0} and # -> My data is {1}",
        match.Groups[1].Value,
        match.Groups[2].Value);

    if (i < matches.Count - 1)
    {
        result.AppendLine(",");
    }
    else
    {
        result.Append(".");
    }
}

return result.ToString();

Thanks to stema, this copes with a $ repeated within the value.

If you want to use a regular expression, this should do it:

\$([\w\d\s]+)\#

This will match everything between $ and #:

\$(.*?)#
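To compare the two patterns above, here is a minimal sketch that collects capturing group 1 of every match. The class ValueExtractor and its Extract method are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ValueExtractor
{
    // Returns capturing group 1 of every match of pattern in data.
    public static List<string> Extract(string data, string pattern)
    {
        var values = new List<string>();

        foreach (Match m in Regex.Matches(data, pattern))
        {
            values.Add(m.Groups[1].Value);
        }

        return values;
    }
}
```

With the input "1$a-b#2$c#", the pattern \$(.*?)# should return "a-b" and "c", while \$([\w\d\s]+)\# should return only "c", because the character class does not allow the hyphen. Both patterns return only the values, not the numeric keys in front of the $.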
