C# regex matches example

I am trying to get values from the following text. How can this be done with Regex?

Input

Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.

Output

456  
3434  
298   
893434 

So you're trying to grab numeric values that are preceded by the token "%download%#"?

Try this pattern:

(?<=%download%#)\d+

That should work. I don't think # or % are special characters in .NET Regex, but you'll have to either escape the backslash like \\ or use a verbatim string for the whole pattern:

var regex = new Regex(@"(?<=%download%#)\d+");
return regex.Matches(strInput);

Tested here: http://rextester.com/BLYCC16700

NOTE: The lookbehind assertion (?<=...) is important because you don't want to include %download%# in your results, only the numbers after it. However, your example appears to require it before each string you want to capture. The lookbehind group will make sure it's there in the input string, but won't include it in the returned results.
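To make the note concrete, here is a minimal sketch that runs the lookbehind pattern over a shortened version of the sample input and reads match.Value directly. The names LookbehindDemo and ExtractIds are made up for illustration and are not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class LookbehindDemo
{
    // Returns every run of digits that directly follows "%download%#".
    public static List<string> ExtractIds(string input)
    {
        var ids = new List<string>();
        // The lookbehind requires the token but keeps it out of match.Value.
        foreach (Match match in Regex.Matches(input, @"(?<=%download%#)\d+"))
        {
            ids.Add(match.Value);
        }
        return ids;
    }

    public static void Main()
    {
        const string input =
            "sit %download%#456 amet, adipiscing %download%#3434 elit. viverra%download%#298.";
        Console.WriteLine(string.Join(", ", ExtractIds(input))); // 456, 3434, 298
    }
}
```

Because the token is only asserted, not consumed, no post-processing of the match text is needed.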

All the other responses I see are fine, but C# has support for named groups!

I'd use the following code:

using System;
using System.Text.RegularExpressions;

class Program
{
    const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

    static void Main(string[] args)
    {
        // The digits after the token are captured in the named group "Identifier".
        Regex expression = new Regex(@"%download%#(?<Identifier>[0-9]*)");
        var results = expression.Matches(input);
        foreach (Match match in results)
        {
            Console.WriteLine(match.Groups["Identifier"].Value);
        }
    }
}

The part of the pattern that reads (?<Identifier>[0-9]*) specifies that whatever [0-9]* matches will be captured in a named group, which we index as above: match.Groups["Identifier"].Value
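One detail that may be worth seeing in code: because [0-9]* also matches zero digits, a bare %download%# token still produces a match, with an empty Identifier. The sketch below compares it with [0-9]+; the class and method names, and the sample string, are illustrative assumptions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class NamedGroupDemo
{
    // Returns the "Identifier" group of every match of the given pattern.
    public static string[] Identifiers(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["Identifier"].Value)
                    .ToArray();
    }

    public static void Main()
    {
        const string input = "%download%#456 and a bare %download%# token";

        // With * the bare token matches too, yielding an empty string.
        Console.WriteLine(Identifiers(input, @"%download%#(?<Identifier>[0-9]*)").Length); // 2

        // With + at least one digit is required, so only "456" is returned.
        Console.WriteLine(Identifiers(input, @"%download%#(?<Identifier>[0-9]+)").Length); // 1
    }
}
```

Which quantifier is appropriate depends on whether an empty identifier should count as a hit.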

public void match2()
{
    string input = "%download%#893434";
    Regex word = new Regex(@"\d+");
    Match m = word.Match(input);
    Console.WriteLine(m.Value);
}

It looks like most of the posts here describe what you need. However, you might need more complex behavior, depending on what you're parsing. In your case you may not need more complex parsing, but it depends on what information you're extracting.

You can use regex group names as field names in a class, after which the code could be written, for example, like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;

public class Info
{
    public String Identifier;
    public char nextChar;
};

class testRegex {

    const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. " +
    "Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

    static void Main(string[] args)
    {
        Regex regex = new Regex(@"%download%#(?<Identifier>[0-9]*)(?<nextChar>.)(?<thisCharIsNotNeeded>.)");
        List<Info> infos = new List<Info>();

        foreach (Match match in regex.Matches(input))
        {
            Info info = new Info();
            for( int i = 1; i < regex.GetGroupNames().Length; i++ )
            {
                String groupName = regex.GetGroupNames()[i];

                FieldInfo fi = info.GetType().GetField(regex.GetGroupNames()[i]);

                if( fi != null ) // Skip groups with no matching public field (e.g. thisCharIsNotNeeded).
                    fi.SetValue( info, Convert.ChangeType( match.Groups[groupName].Value, fi.FieldType));
            }
            infos.Add(info);
        }

        foreach ( var info in infos )
        {
            Console.WriteLine(info.Identifier + " followed by '" + info.nextChar.ToString() + "'");
        }
    }

};

This mechanism uses C# reflection to set values on the class instance. Each group name is matched against a field name in the class. Please note that Convert.ChangeType won't accept just any garbage: it throws if the captured text cannot be converted to the field's type.
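As a rough illustration of that last point: if a field were declared as int or char and the group captured text that does not fit, Convert.ChangeType would throw (typically a FormatException). A guarded wrapper might look like the sketch below; the helper name TryConvert is hypothetical and not part of the code above.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class ChangeTypeDemo
{
    // Returns false instead of throwing when the text cannot be converted.
    public static bool TryConvert(string text, Type targetType, out object result)
    {
        try
        {
            result = Convert.ChangeType(text, targetType);
            return true;
        }
        catch (Exception ex) when (ex is FormatException ||
                                   ex is InvalidCastException ||
                                   ex is OverflowException)
        {
            result = null;
            return false;
        }
    }

    public static void Main()
    {
        object value;
        Console.WriteLine(TryConvert("456", typeof(int), out value)); // True
        Console.WriteLine(TryConvert("", typeof(int), out value));    // False: empty text is not a number
        Console.WriteLine(TryConvert("ab", typeof(char), out value)); // False: char needs exactly one character
    }
}
```

In the reflection loop, a check like this could replace the direct Convert.ChangeType call if malformed captures are expected.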

If you want to add tracking of line / column, you can add an extra Regex split for lines, but in order to keep the for loop intact, all match patterns must have named groups. (Otherwise the column index will be calculated incorrectly.)
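One possible reading of the line / column idea, sketched under assumptions: the input is split on line breaks with Regex.Split, the pattern is run per line, the 1-based line number comes from the loop index, and the column comes from the Index of the named group. The names PositionDemo and Find are illustrative, and this sketch leaves out the reflection loop to stay short.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class PositionDemo
{
    // Returns (identifier, line, column) for every token; line and column are 1-based.
    public static List<Tuple<string, int, int>> Find(string input)
    {
        var regex = new Regex(@"%download%#(?<Identifier>[0-9]+)");
        var found = new List<Tuple<string, int, int>>();

        // Split on Windows or Unix line breaks.
        string[] lines = Regex.Split(input, "\r\n|\n");
        for (int i = 0; i < lines.Length; i++)
        {
            foreach (Match match in regex.Matches(lines[i]))
            {
                Group id = match.Groups["Identifier"];
                // Group.Index is the 0-based offset of the digits within the line.
                found.Add(Tuple.Create(id.Value, i + 1, id.Index + 1));
            }
        }
        return found;
    }

    public static void Main()
    {
        string input = "sit %download%#456 amet\nviverra%download%#298.";
        foreach (var hit in Find(input))
        {
            Console.WriteLine(hit.Item1 + " at line " + hit.Item2 + ", column " + hit.Item3);
        }
        // 456 at line 1, column 16
        // 298 at line 2, column 19
    }
}
```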

The reflection-based program (class testRegex) produces the following output:

456 followed by ' '
3434 followed by ' '
298 followed by '.'
893434 followed by ' '

// The digits after the token are captured in group 1.
Regex regex = new Regex("%download%#(\\d+)", RegexOptions.Singleline);
MatchCollection m = regex.Matches(input);

I think this will do the trick (not tested).

This pattern should work:

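#(\d+)

foreach (System.Text.RegularExpressions.Match match in System.Text.RegularExpressions.Regex.Matches(input, @"#(\d+)"))
{
    // Groups[1] holds the digits without the leading '#'.
    Console.WriteLine(match.Groups[1].Value);
}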

(I'm not in front of Visual Studio, but even if that doesn't compile as-is, it should be close enough to tweak into something that works.)
