
How to find all the words starting with '$' sign and ending with space, in a long string?



var matches = Regex.Matches(input, "(\\$\\w+) ");

In the above, \\w matches word characters: letters (A-Z, a-z), digits (0-9) and the underscore _, but not the hyphen. In .NET it also matches other Unicode letters and digits. If you want to match everything that's not whitespace, you can use \\S. If you want a specific set, specify it explicitly, e.g. [a-zA-Z0-9].
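To make the difference between these character classes concrete, here is a small sketch that runs the three patterns over the same input. The class name `DollarWordDemo`, the helper `Extract` and the sample input are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DollarWordDemo
{
    // Hypothetical helper: returns capture group 1 of every match of `pattern` in `input`.
    public static string[] Extract(string input, string pattern) =>
        Regex.Matches(input, pattern)
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToArray();

    static void Main()
    {
        string input = "$user_1 $first-name $x9 ";

        // \w+ : letters, digits and underscore; '-' is not a word character,
        // so "$first-name" is never followed directly by a space and is skipped.
        Console.WriteLine(string.Join(", ", Extract(input, "(\\$\\w+) ")));
        // prints: $user_1, $x9

        // \S+ : any run of non-whitespace characters.
        Console.WriteLine(string.Join(", ", Extract(input, "(\\$\\S+) ")));
        // prints: $user_1, $first-name, $x9

        // Explicit set: ASCII letters and digits only, so '_' and '-' both break the word.
        Console.WriteLine(string.Join(", ", Extract(input, "(\\$[a-zA-Z0-9]+) ")));
        // prints: $x9
    }
}
```

Which class is right depends on what counts as a "word" in your data; \\S is the closest to "everything up to the next space".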

The parentheses around (\\$\\w+) form a capture group, so for a specific match, e.g. matches[0], the expression matches[0].Groups[1].Value gives the value inside the parentheses (that is, excluding the trailing space).

As a complete example:

string input = "$a1 $a2 $b1 $b2";

foreach (Match match in Regex.Matches(input, "(\\$\\w+) "))
    Console.WriteLine(match.Groups[1].Value);

This produces the following output:

$a1
$a2
$b1

The $b2 is of course omitted because it does not have a trailing space.
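If the last word should be found as well, one option is a lookahead instead of a literal trailing space. The sketch below is only an illustration under the assumption that end of input may be treated like a space, which goes slightly beyond what the question asks; `TrailingWordDemo` and `Find` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TrailingWordDemo
{
    // Hypothetical helper: returns the full text of every match.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    static void Main()
    {
        string input = "$a1 $a2 $b1 $b2";

        // Lookahead for a space: same words as "(\\$\\w+) ", but the space is
        // not part of the match, so no capture group is needed.
        Console.WriteLine(string.Join(", ", Find(input, "\\$\\w+(?= )")));
        // prints: $a1, $a2, $b1

        // Assumption: a word followed by whitespace OR by the end of the input counts.
        Console.WriteLine(string.Join(", ", Find(input, "\\$\\w+(?=\\s|$)")));
        // prints: $a1, $a2, $b1, $b2
    }
}
```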

You could also try it without regular expressions, which may be faster.

string longText = "";
List<string> found = new List<string>();
foreach (var item in longText.Split(' '))
    if (item.StartsWith("$"))
        found.Add(item);  // keep every token that starts with '$'
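As a self-contained sketch of the split-based idea, the version below wraps the loop in a method and runs it on a sample sentence. The names `SplitDemo` and `FindDollarWords` and the sample text are assumptions for illustration. Note that this approach is not strictly equivalent to the regex: it also returns the last token (which has no trailing space) and tokens containing characters such as '-'.

```csharp
using System;
using System.Collections.Generic;

static class SplitDemo
{
    // Hypothetical helper: splits on single spaces and keeps tokens that start with '$'.
    public static List<string> FindDollarWords(string text)
    {
        var found = new List<string>();
        foreach (var item in text.Split(' '))
        {
            // Ordinal comparison avoids culture-sensitive string matching.
            if (item.StartsWith("$", StringComparison.Ordinal))
                found.Add(item);
        }
        return found;
    }

    static void Main()
    {
        var words = FindDollarWords("pay $a1 to $first-name and $b2");
        Console.WriteLine(string.Join(", ", words));
        // prints: $a1, $first-name, $b2
    }
}
```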

EDIT: After Zain Shaikh's comment I wrote a simple program to benchmark both approaches. Here are the results.

        string input = "$a1 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2 $a2 $b1 $b2";
        var found = new List<string>();
        var s1 = Stopwatch.StartNew();
        double first;
        foreach (Match match in Regex.Matches(input, "(\\$\\w+) "))
            found.Add(match.Groups[1].Value);  // assumed loop body: collect the match
        Console.WriteLine(" 1) " + (s1.Elapsed.TotalMilliseconds * 1000 * 1000).ToString("0.00 ns"));
        first = s1.Elapsed.TotalMilliseconds;

        found.Clear();
        s1 = Stopwatch.StartNew();

        foreach (var item in input.Split(' '))
            if (item.StartsWith("$"))
                found.Add(item);  // assumed loop body: collect the token
        Console.WriteLine(" 2) " + (s1.Elapsed.TotalMilliseconds * 1000 * 1000).ToString("0.00 ns"));
        Console.WriteLine(s1.Elapsed.TotalMilliseconds - first);


1) 730600.00 ns

2)  53000.00 ns


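In this run, the string functions (even with foreach) were roughly 14 times faster than the regular expression ;) Keep in mind that this is a single measurement, and that the first regex call also pays the one-time cost of constructing the regex.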
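A single timed run is sensitive to JIT compilation and regex construction, so a somewhat fairer comparison warms each approach up once and averages over many iterations. The following is only a sketch: the names `DollarWordBenchmark`, `CountWithRegex`, `CountWithSplit` and `AverageNanoseconds`, the iteration count and the use of `RegexOptions.Compiled` are all assumptions, not part of the original benchmark, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

static class DollarWordBenchmark
{
    // Construct the regex once so the timed loop measures matching only.
    static readonly Regex Pattern = new Regex("(\\$\\w+) ", RegexOptions.Compiled);

    public static int CountWithRegex(string input)
    {
        int count = 0;
        foreach (Match match in Pattern.Matches(input))
            count++;
        return count;
    }

    public static int CountWithSplit(string input)
    {
        int count = 0;
        foreach (var item in input.Split(' '))
            if (item.StartsWith("$", StringComparison.Ordinal))
                count++;
        return count;
    }

    // Average time per call in nanoseconds, after one untimed warm-up call.
    public static double AverageNanoseconds(Func<string, int> action, string input, int iterations)
    {
        action(input); // warm-up: JIT compilation, regex initialization
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action(input);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds * 1000 * 1000 / iterations;
    }

    static void Main()
    {
        // 100 '$' words, each followed by a space.
        string input = string.Concat(Enumerable.Repeat("$a1 $a2 $b1 $b2 ", 25));
        const int iterations = 10000;

        Console.WriteLine(" 1) regex: " + AverageNanoseconds(CountWithRegex, input, iterations).ToString("0.00") + " ns");
        Console.WriteLine(" 2) split: " + AverageNanoseconds(CountWithSplit, input, iterations).ToString("0.00") + " ns");
    }
}
```

For anything beyond a quick comparison, a dedicated benchmarking harness is usually a better choice than hand-rolled Stopwatch loops.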

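// Split on spaces, then keep the tokens that start with '$';
// the ^ anchor stops tokens such as "a$b" from matching.
var a1 = "fdjksf $jgjkd $hfj".Split(" ".ToCharArray())
                             .Where(X => Regex.Match(X, "^(\\$[a-zA-Z]*)").Success);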

