Using Regex in C#.NET to extract data from a string

I am trying to extract the values AGST, 1, and 2 from:

string order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

I found a close answer on Stack Overflow here.

I would like to use a method similar to the top solution at the above link, but I think I have to change the regular expression in the line:

var pattern = string.Format("(?<=[\\{{\\s,]{0}\\s*=\\s*)\\d+", escIdPart);

Any help is greatly appreciated!

Edit:

Thanks for the help! Here is my current code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue("ID:", order); // should return "AGST"
    var quantity = GetValue("Quantity:", order); // should return "1"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
}

public string GetValue(string idPart, string test)
{
    var escIdPart = Regex.Escape(idPart);
    var pattern = string.Format(@": (.+)?,.*: (\d+).*(\d+)", escIdPart);
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Value;
    }
    return result;
}

id.ToString() and quantity.ToString() both produce ": AGST, Quantity: 1, Points each: 2" when they should produce "AGST" and "1" respectively.

Again, any help is appreciated!

Edit 2: Solved!

Thanks for all the help! Here is my final code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue(1, order); // returns "AGST"
    var quantity = GetValue(2, order); // returns "1"
    var cost = GetValue(3, order); // returns "2"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
    Label5.Text = cost.ToString();
}

public string GetValue(int group, string test)
{
    var pattern = @": (.+)?,.*: (\d+).*(\d+)";
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Groups[group].Value;
    }
    return result;
}

Edit 3: "var pattern" expression change

I found that the expression only works if the value after "Points each: " is a single digit. I changed the expression, and it now seems to work with any number of digits in the values following "Quantity: " and "Points each: ". Any objections or suggestions? Here is the code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue(1, order); // returns "AGST"
    var quantity = GetValue(2, order); // returns "1"
    var cost = GetValue(3, order); // returns "2"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
    Label5.Text = cost.ToString();
}

public string GetValue(int group, string test)
{
    var pattern = @": (.+)?,.*: (\d+).*: (\d+)";
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Groups[group].Value;
    }
    return result;
}
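The difference between the two patterns is easier to see with multi-digit values. The following is a minimal sketch, not part of the original code: the class name GreedyDemo, the helper Groups, and the sample order with the values 10 and 25 are assumptions made for illustration. In the older pattern nothing ties the last two groups to their labels, so the greedy quantifiers can backtrack onto unintended text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, introduced only for this illustration.
public static class GreedyDemo
{
    // Joins capture groups 1..3 with '|', or returns null when there is no match.
    public static string Groups(string pattern, string input)
    {
        var match = Regex.Match(input, pattern);
        if (!match.Success)
        {
            return null;
        }
        return match.Groups[1].Value + "|" + match.Groups[2].Value + "|" + match.Groups[3].Value;
    }

    public static void Main()
    {
        // Assumed sample input with two-digit values.
        var order = "Tumbler (ID: AGST, Quantity: 10, Points each: 25)";

        // Older pattern: the groups are not anchored to their labels.
        Console.WriteLine(Groups(@": (.+)?,.*: (\d+).*(\d+)", order));
        // prints "AGST, Quantity: 10|2|5"

        // Newer pattern: the extra ": " before the last group keeps "25" together.
        Console.WriteLine(Groups(@": (.+)?,.*: (\d+).*: (\d+)", order));
        // prints "AGST|10|25"
    }
}
```

With the single-digit sample from the question, both patterns happen to return AGST, 1, and 2, which is why the problem only shows up with longer values.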

I think you can simplify the expression a bit. Consider the following:

: (.+)?,.*: (\d+).*(\d+)

(Regular expression visualization)

Debuggex Demo

Try the following regex:

"ID: (.*), Quantity: (.*), Points each: (.*)\)"

After that you can get AGST from group 1, 1 from group 2, and 2 from group 3.
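As a rough sketch of how those three groups might be read in C#, assuming the order string from the question: the class name OrderFields and the method Extract are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper, introduced only for this illustration.
public static class OrderFields
{
    // Returns { id, quantity, points } as strings, or null when the input does not match.
    public static string[] Extract(string order)
    {
        var match = Regex.Match(order, @"ID: (.*), Quantity: (.*), Points each: (.*)\)");
        if (!match.Success)
        {
            return null;
        }
        // Groups[0] is the whole match; the captured values start at index 1.
        return new[]
        {
            match.Groups[1].Value,
            match.Groups[2].Value,
            match.Groups[3].Value
        };
    }

    public static void Main()
    {
        var fields = Extract("Tumbler (ID: AGST, Quantity: 1, Points each: 2)");
        Console.WriteLine(string.Join(", ", fields)); // prints "AGST, 1, 2"
    }
}
```

Because the literal labels sit between the groups, each (.*) is forced to stop at the next label even though it is greedy.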

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string pattern = @"ID\s?\:\s(?<id>\w+).+Quantity\s?\:\s(?<quantity>\d+).+each\s?\:\s(?<points>\d+)";
        string input = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

        Regex regex = new Regex(pattern);
        Match m = regex.Match(input);

        if(m.Success)
        {
            string id = m.Groups["id"].Value;
            int quantity = Int32.Parse(m.Groups["quantity"].Value);
            int points = Int32.Parse(m.Groups["points"].Value);

            Console.WriteLine(id + ", " + quantity + ", " + points);
        }
    }
}

See the example on DotNetFiddle.

Here is an approach that uses only string methods:

string order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";
var keyValues = new List<KeyValuePair<string, string>>();

int index = order.IndexOf('(');
if(index++ >= 0)
{
    int endIndex = order.IndexOf(')', index);
    if(endIndex >= 0)
    {
        string inBrackets = order.Substring(index, endIndex - index);
        string[] tokens = inBrackets.Trim().Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);

        foreach(string token in tokens)
        {
            string[] keyVals = token.Trim().Split(new[]{':'}, StringSplitOptions.RemoveEmptyEntries);
            if(keyVals.Length == 2)
            {
                keyValues.Add(new KeyValuePair<string,string>(keyVals[0].Trim(), keyVals[1].Trim()));
            }
        }
    }
}
foreach (var keyVal in keyValues)
{
    Console.WriteLine("{0} {1}", keyVal.Key, keyVal.Value);
}
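If the goal is to look a value up by its label, as in the question's GetValue("ID:", order), the same splitting idea could fill a dictionary instead of a list. This is only a sketch under that assumption: the class name OrderParser and the method Parse are hypothetical, and it expects each label to appear at most once between the parentheses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, introduced only for this illustration.
public static class OrderParser
{
    // Splits the text between '(' and ')' into label/value pairs keyed by label.
    public static Dictionary<string, string> Parse(string order)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        int start = order.IndexOf('(');
        int end = start >= 0 ? order.IndexOf(')', start + 1) : -1;
        if (end < 0)
        {
            return result; // no bracketed section, so nothing to parse
        }

        string inBrackets = order.Substring(start + 1, end - start - 1);
        foreach (string token in inBrackets.Split(','))
        {
            int colon = token.IndexOf(':');
            if (colon <= 0)
            {
                continue; // skip tokens that have no label before the colon
            }
            // A later duplicate label would overwrite an earlier one.
            result[token.Substring(0, colon).Trim()] = token.Substring(colon + 1).Trim();
        }
        return result;
    }

    public static void Main()
    {
        var values = Parse("Tumbler (ID: AGST, Quantity: 1, Points each: 2)");
        Console.WriteLine(values["ID"]);          // prints "AGST"
        Console.WriteLine(values["Quantity"]);    // prints "1"
        Console.WriteLine(values["Points each"]); // prints "2"
    }
}
```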

What about this?

var pattern = @"^Tumbler \(ID\: ([A-Z]+), Quantity\: (\d+), Points each\: (\d+)\)";
var regex = new Regex(pattern);
var match = regex.Match("Tumbler (ID: AGST, Quantity: 1, Points each: 2)");
foreach (var group in match.Groups)
{
    Console.WriteLine(group.ToString());
}

Output should be:

Tumbler (ID: AGST, Quantity: 1, Points each: 2)
AGST
1
2

I am assuming some uniformity across other target strings insofar as alpha IDs and whole-number quantities & points, but you can adjust as needed.
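As one possible adjustment, here is a sketch that also accepts other product names and alphanumeric IDs. The class name OrderPattern, the method TryParse, the group names, and the sample "Mug" order are all assumptions for illustration, not part of the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, introduced only for this illustration.
public static class OrderPattern
{
    // Anchored at both ends; the product name is matched lazily up to " (ID: ".
    private static readonly Regex Pattern = new Regex(
        @"^(?<name>.+?) \(ID: (?<id>[A-Za-z0-9]+), Quantity: (?<quantity>\d+), Points each: (?<points>\d+)\)$");

    public static bool TryParse(string order, out string name, out string id, out int quantity, out int points)
    {
        name = null;
        id = null;
        quantity = 0;
        points = 0;

        Match match = Pattern.Match(order);
        if (!match.Success)
        {
            return false;
        }

        name = match.Groups["name"].Value;
        id = match.Groups["id"].Value;
        // int.TryParse guards against digit runs too long to fit in an int.
        return int.TryParse(match.Groups["quantity"].Value, out quantity)
            && int.TryParse(match.Groups["points"].Value, out points);
    }

    public static void Main()
    {
        string name, id;
        int quantity, points;
        if (TryParse("Mug (ID: B12, Quantity: 10, Points each: 25)", out name, out id, out quantity, out points))
        {
            Console.WriteLine(name + ", " + id + ", " + quantity + ", " + points); // prints "Mug, B12, 10, 25"
        }
    }
}
```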

MSDN has some great reference info and examples to help.

Also, check out Regex Hero's online tester to tinker - with IntelliSense even. :) You can tinker with a copy of the above regex pattern that I saved there.

