
C# intersect two lists of strings with a pattern

I have two lists of strings like this:

var entities= new List<string>(){"101", "102", "103",}; 
var files= new List<string>(){"101_F05_20101001.csv", "102_F05_20101001.csv", "201_F05_20101001.csv", "202_F05_20101001.csv"};

I want to intersect them so that only the files following this pattern are kept:

ID_F05_YYYYMMDD.csv

where ID must match one of the items in the entities list.

I have written the following code:

var list = files
    .Where(x => entities.Any(y => x.Contains(y) && x.Substring(0, y.Length) == y))
    .ToList();


But I wonder whether this can be improved using a regex like this one:

var regex = new Regex(@"^(\d*)_F05_\d*\.csv$");

Is it possible?

You may use

.Where(x => Regex.IsMatch(x, $@"^(?:{string.Join("|", entities)})_F05_\d*\.csv$"))

Given your current input data, the regex will look like ^(?:101|102|103)_F05_\d*\.csv$ and it will match:

  • ^ - start of string
  • (?:101|102|103) - a non-capturing group that matches 101, 102 or 103
  • _F05_ - the literal string _F05_
  • \d* - 0 or more digits
  • \.csv - the literal string .csv
  • $ - end of string

Note that you do not need to regex-escape the entities if they are numeric. Otherwise, you need to use string.Join("|", entities.Select(Regex.Escape)).
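To illustrate the escaping note above, here is a minimal sketch assuming the entity IDs may contain regex metacharacters. The entity values ("A.1", "B+2"), the file names, and the helper name BuildPattern are made up for illustration and are not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EntityPattern
{
    // Hypothetical helper: builds the alternation pattern, escaping each ID
    // so that characters such as '.' or '+' are matched literally.
    public static string BuildPattern(IEnumerable<string> entities) =>
        $@"^(?:{string.Join("|", entities.Select(Regex.Escape))})_F05_\d*\.csv$";

    public static void Main()
    {
        // Illustrative data (assumption): IDs containing regex metacharacters.
        var entities = new List<string> { "A.1", "B+2" };
        var files = new List<string>
        {
            "A.1_F05_20101001.csv",
            "AX1_F05_20101001.csv", // would match an unescaped "A.1"
            "B+2_F05_20101001.csv"
        };

        // Build the Regex once instead of re-parsing the pattern per file.
        var regex = new Regex(BuildPattern(entities));

        foreach (var f in files.Where(f => regex.IsMatch(f)))
        {
            Console.WriteLine(f);
        }
    }
}
```

With the escaping in place, AX1_F05_20101001.csv is rejected; without it, the unescaped dot in "A.1" would match any character and let that file through.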

C# code demo:

var entities= new List<string>(){"101", "102", "103",}; 
var files= new List<string>(){"101_F05_20101001.csv", "102_F05_20101001.csv", "201_F05_20101001.csv", "202_F05_20101001.csv"};

var pat = $@"^(?:{string.Join("|", entities)})_F05_\d*\.csv$";

var list = files
        .Where(x => Regex.IsMatch(x, pat))
        .ToList();

foreach (var s in list) {
    Console.WriteLine(s);
}

Output:

101_F05_20101001.csv
102_F05_20101001.csv
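The pattern above uses \d*, which also accepts an empty or arbitrarily long run of digits. If the date part has to be exactly YYYYMMDD, as the pattern in the question suggests, one possible tightening is sketched below. It assumes that rejecting impossible dates is desired; the class name StrictFileFilter, the method name Filter, and the extra sample file names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class StrictFileFilter
{
    // Hypothetical helper: keeps files named ID_F05_YYYYMMDD.csv whose ID is
    // in 'entities' and whose date part is a real calendar date.
    public static List<string> Filter(IEnumerable<string> entities, IEnumerable<string> files)
    {
        var ids = string.Join("|", entities.Select(Regex.Escape));

        // \d{8} requires exactly eight digits; the braces are doubled because
        // this is an interpolated string. The named group captures the date.
        var regex = new Regex($@"^(?:{ids})_F05_(?<date>\d{{8}})\.csv$");

        return files.Where(f =>
        {
            var m = regex.Match(f);
            return m.Success && DateTime.TryParseExact(
                m.Groups["date"].Value, "yyyyMMdd",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out _);
        }).ToList();
    }

    public static void Main()
    {
        var entities = new List<string> { "101", "102", "103" };
        var files = new List<string>
        {
            "101_F05_20101001.csv", // valid
            "102_F05_20101332.csv", // eight digits, but not a real date
            "103_F05_.csv",         // accepted by \d*, rejected by \d{8}
            "201_F05_20101001.csv"  // ID not in the entities list
        };

        foreach (var f in Filter(entities, files))
        {
            Console.WriteLine(f);
        }
    }
}
```

Whether the extra date check is worth it depends on how much the file names can be trusted; the regex alone is enough if only the shape matters.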

I chose to use a bit more LINQ to solve it:

var entities = new List<string>() { "101", "102", "103", };
var files = new List<string>() { "101_F05_20101001.csv", "102_F05_20101001.csv", "201_F05_20101001.csv", "202_F05_20101001.csv" };
var regex = new Regex(@"^(\d*)_F05_\d*\.csv$");

// For each entity, keep the files whose captured ID (group 1) equals it.
var result = entities.SelectMany(e => files.Select(f =>
{
    var match = regex.Match(f);
    // Group 1 always exists when this pattern matches.
    if (match.Success && match.Groups[1].Value == e) return f;

    return "";
})).Where(s => !String.IsNullOrEmpty(s));
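The nested SelectMany/Select above runs the regex once for every (entity, file) pair. A minimal sketch of an alternative that runs it once per file is shown here: it captures the leading ID and looks it up in a HashSet. The class name CaptureLookup and the method name Filter are hypothetical, and the sketch assumes the ID is a non-empty run of digits (\d+ rather than \d*).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureLookup
{
    // Group 1 captures the ID at the start of the file name.
    private static readonly Regex FileName = new Regex(@"^(\d+)_F05_\d*\.csv$");

    // Hypothetical helper: returns the files whose captured ID is in 'entities'.
    public static List<string> Filter(IEnumerable<string> entities, IEnumerable<string> files)
    {
        // A HashSet makes each ID lookup a constant-time operation on average.
        var ids = new HashSet<string>(entities);

        return files
            .Where(f =>
            {
                var m = FileName.Match(f);
                return m.Success && ids.Contains(m.Groups[1].Value);
            })
            .ToList();
    }

    public static void Main()
    {
        var entities = new List<string> { "101", "102", "103" };
        var files = new List<string>
        {
            "101_F05_20101001.csv", "102_F05_20101001.csv",
            "201_F05_20101001.csv", "202_F05_20101001.csv"
        };

        foreach (var f in Filter(entities, files))
        {
            Console.WriteLine(f);
        }
    }
}
```

Unlike the alternation pattern, this version keeps the regex fixed, so the entities list can change without rebuilding the pattern.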
