
Using Regex in C#.NET to extract data from a string

I am trying to extract the values AGST, 1, and 2 from:

string order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

I found a close answer on Stack Overflow here.

I would like to use a method similar to the top solution at the above link, but I think I have to change the regular expression in this line:

var pattern = string.Format("(?<=[\\{{\\s,]{0}\\s*=\\s*)\\d+", escIdPart);

Any help is greatly appreciated!

Edit:

Thanks for the help! Here is my current code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue("ID:", order); // should return "AGST"
    var quantity = GetValue("Quantity:", order); // should return "1"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
}

public string GetValue(string idPart, string test)
{
    var escIdPart = Regex.Escape(idPart);
    var pattern = string.Format(@": (.+)?,.*: (\d+).*(\d+)", escIdPart);
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Value;
    }
    return result;
}

id.ToString() and quantity.ToString() both produce ": AGST, Quantity: 1, Points each: 2" when they should produce "AGST" and "1" respectively.
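The result described above is consistent with match.Value returning the entire matched text, while the parenthesized parts are exposed separately through match.Groups. The following is a minimal sketch of that difference, not part of the original code; the class and method names (MatchValueDemo, WholeMatch, GroupValue) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class MatchValueDemo
{
    // Same pattern as in GetValue above. It contains no {0} placeholder,
    // so passing it through string.Format leaves it unchanged.
    private const string Pattern = @": (.+)?,.*: (\d+).*(\d+)";

    // Returns the whole matched text, which is what match.Value gives.
    public static string WholeMatch(string input)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Value : null;
    }

    // Returns a single capture group. Group 0 is the whole match;
    // groups 1..n follow the order of the opening parentheses.
    public static string GroupValue(string input, int group)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[group].Value : null;
    }
}
```

For the order string in the question, WholeMatch gives the long string reported above, while GroupValue(order, 1) and GroupValue(order, 2) give "AGST" and "1".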

Again, any help is appreciated!

Edit 2: Solved!

Thanks for all the help! Here is my final code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue(1, order); // returns "AGST"
    var quantity = GetValue(2, order); // returns "1"
    var cost = GetValue(3, order); // returns "2"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
    Label5.Text = cost.ToString();
}

public string GetValue(int group, string test)
{
    var pattern = @": (.+)?,.*: (\d+).*(\d+)";
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Groups[group].Value;
    }
    return result;
}

Edit 3: "var pattern" expression change

I found that the expression only works if the value after "Points each: " is one digit. I changed the expression, and it now seems to work with any number of digits in the values following "Quantity: " and "Points each: ". Any objections or suggestions? Here is the code:

protected void Page_Load(object sender, EventArgs e)
{
    var order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

    var id = GetValue(1, order); // returns "AGST"
    var quantity = GetValue(2, order); // returns "1"
    var cost = GetValue(3, order); // returns "2"

    Label3.Text = id.ToString();
    Label4.Text = quantity.ToString();
    Label5.Text = cost.ToString();
}

public string GetValue(int group, string test)
{
    var pattern = @": (.+)?,.*: (\d+).*: (\d+)";
    var result = default(string);
    var match = Regex.Match(test, pattern, RegexOptions.IgnoreCase);
    if (match.Success)
    {
        result = match.Groups[group].Value;
    }
    return result;
}
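The one-digit limitation mentioned in Edit 3 can be checked by running both patterns against an order with a two-digit "Points each" value. This is only a sketch for comparison; PatternComparison and Extract are hypothetical names, and the two-digit test string is an assumed input, not one from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PatternComparison
{
    // Pattern from Edit 2: the trailing ".*" is greedy, so the last (\d+)
    // can be left with only the final digit of the number.
    public const string Edit2Pattern = @": (.+)?,.*: (\d+).*(\d+)";

    // Pattern from Edit 3: the extra ": " anchors the last group to the
    // start of the number, so all of its digits are captured.
    public const string Edit3Pattern = @": (.+)?,.*: (\d+).*: (\d+)";

    // Returns groups 1..3, or an empty array when nothing matches.
    public static string[] Extract(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }
}
```

With the assumed input "Tumbler (ID: AGST, Quantity: 1, Points each: 25)", the Edit 3 pattern yields "AGST", "1" and "25", whereas the Edit 2 pattern does not return "25" for the third group. Both patterns still rely on greedy ".*" and on the fields appearing in a fixed order, so they may misbehave if an ID contains a comma or further fields are added.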

I think you can simplify the expression a bit. Consider the following:

: (.+)?,.*: (\d+).*(\d+)

Debuggex Demo

Try the following regex:

"ID: (.*), Quantity: (.*), Points each: (.*)\)"

After that you can get AGST from group 1, 1 from group 2, and 2 from group 3.
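As a rough sketch of how the three groups could be read with that regex (OrderFields and TryParse are hypothetical names, not from the answer):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class OrderFields
{
    // The regex from the answer, written as a verbatim string so that
    // the escaped closing parenthesis reaches the regex engine unchanged.
    private static readonly Regex OrderRegex =
        new Regex(@"ID: (.*), Quantity: (.*), Points each: (.*)\)");

    // Reads groups 1..3 into the out parameters.
    // When there is no match, each Value is an empty string.
    public static bool TryParse(string order, out string id, out string quantity, out string points)
    {
        Match m = OrderRegex.Match(order);
        id = m.Groups[1].Value;
        quantity = m.Groups[2].Value;
        points = m.Groups[3].Value;
        return m.Success;
    }
}
```

Because each group is "(.*)", the values are not validated: a non-numeric quantity would be captured as is, so the caller may still want int.TryParse on the numeric fields.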

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string pattern = @"ID\s?\:\s(?<id>\w+).+Quantity\s?\:\s(?<quantity>\d+).+each\s?\:\s(?<points>\d+)";
        string input = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";

        Regex regex = new Regex(pattern);
        Match m = regex.Match(input);

        if(m.Success)
        {
            string id = m.Groups["id"].Value;
            int quantity = Int32.Parse(m.Groups["quantity"].Value);
            int points = Int32.Parse(m.Groups["points"].Value);

            Console.WriteLine(id + ", " + quantity + ", " + points);
        }
    }
}

See the example on DotNetFiddle.

Here is an approach that uses only string methods:

using System;
using System.Collections.Generic;

string order = "Tumbler (ID: AGST, Quantity: 1, Points each: 2)";
var keyValues = new List<KeyValuePair<string, string>>();

int index = order.IndexOf('(');
if(index++ >= 0)
{
    int endIndex = order.IndexOf(')', index);
    if(endIndex >= 0)
    {
        string inBrackets = order.Substring(index, endIndex - index);
        string[] tokens = inBrackets.Trim().Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);

        foreach(string token in tokens)
        {
            string[] keyVals = token.Trim().Split(new[]{':'}, StringSplitOptions.RemoveEmptyEntries);
            if(keyVals.Length == 2)
            {
                keyValues.Add(new KeyValuePair<string,string>(keyVals[0].Trim(), keyVals[1].Trim()));
            }
        }
    }
}
foreach (var keyVal in keyValues)
{
    Console.WriteLine("{0} {1}", keyVal.Key, keyVal.Value);
}

What about this?

var pattern = @"^Tumbler \(ID\: ([A-Z]+), Quantity\: (\d+), Points each\: (\d+)\)";
var regex = new Regex(pattern);
var match = regex.Match("Tumbler (ID: AGST, Quantity: 1, Points each: 2)");
foreach (var group in match.Groups)
{
    Console.WriteLine(group.ToString());
}

Output should be:

Tumbler (ID: AGST, Quantity: 1, Points each: 2)
AGST
1
2

I am assuming some uniformity across other target strings: alphabetic IDs and whole-number quantities and points. You can adjust the pattern as needed.
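As one possible adjustment, here is a sketch that no longer hard-codes the product name and also accepts digits in the ID. The relaxed format, the named groups, and the OrderLine class are assumptions made for illustration; the actual set of target strings is not known.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class OrderLine
{
    // Assumed format: "<name> (ID: <letters/digits>, Quantity: <int>, Points each: <int>)"
    private static readonly Regex Pattern = new Regex(
        @"^(?<name>.+?) \(ID: (?<id>[A-Za-z0-9]+), Quantity: (?<quantity>\d+), Points each: (?<points>\d+)\)$");

    // Returns false when the text does not fit the assumed format
    // or when a number does not fit into an int.
    public static bool TryParse(string order, out string name, out string id, out int quantity, out int points)
    {
        name = null;
        id = null;
        quantity = 0;
        points = 0;

        Match m = Pattern.Match(order);
        if (!m.Success)
        {
            return false;
        }

        name = m.Groups["name"].Value;
        id = m.Groups["id"].Value;
        return int.TryParse(m.Groups["quantity"].Value, out quantity)
            && int.TryParse(m.Groups["points"].Value, out points);
    }
}
```

Anchoring with ^ and $ makes the match fail outright on strings that deviate from the assumed format, rather than silently capturing partial values.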

MSDN has some great reference info and examples to help.

Also, check out Regex Hero's online tester to tinker, with IntelliSense even. :) You can tinker with a copy of the above regex pattern that I saved there.
