C# regular expression: how to extract a collection

Question

I have a collection in a text file:

(Collection
  (Item "Name1" 1 2 3)
  (Item "Simple name2" 1 2 3)
  (Item "Just name 3" 4 5 6))

The collection could also be empty:

(Collection)

The number of items is not fixed: there could be one item or one hundred. From a previous extraction I already have the inner text of the Collection element:

(Item "Name1" 1 2 3)(Item "Simple name2" 1 2 3)(Item "Just name 3" 4 5 6)

For an empty collection, it will be an empty string.

How can I parse this collection using a .NET regular expression?

I tried this:

string pattern = @"(\(Item\s""(?<Name>.*)""\s(?<Type>.*)\s(?<Length>.*)\s(?<Number>.*))*";

But the code above doesn't produce any real results.

UPDATE:

I tried to use the regex differently:

foreach (Match match in Regex.Matches(document, pattern, RegexOptions.Singleline))
{
    for (int i = 0; i < match.Groups["Name"].Captures.Count; i++)
    {
        Console.WriteLine(match.Groups["Name"].Captures[i].Value);
    }
}

or

while (m.Success)
{
    m.Groups["Name"].Value.Dump();
    m.NextMatch();
}

Answer 1

Try

\(Item (?<part1>\".*?\")\s(?<part2>\d+)\s(?<part3>\d+)\s(?<part4>\d+)\)

This will create a collection of matches:

Regex regex = new Regex(
      "\\(Item (?<part1>\\\".*?\\\")\\s(?<part2>\\d+)\\s(?<part3>\\d"+
      "+)\\s(?<part4>\\d+)\\)",
    RegexOptions.Multiline | RegexOptions.Compiled
    );

//Capture all Matches in the InputText
MatchCollection ms = regex.Matches(InputText);


//Get the names of all the named and numbered capture groups
string[] GroupNames = regex.GetGroupNames();

// Get the numbers of all the named and numbered capture groups
int[] GroupNumbers = regex.GetGroupNumbers();
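
The snippet above stops once the matches are collected. A minimal sketch of reading the named groups out of each match might look like the following. The `ItemRecord` type and the `ItemParser.Parse` helper are hypothetical names introduced here for illustration; the group names are borrowed from the question's own pattern, and the three numbers are assumed to be non-negative integers small enough to fit in an `int`, as in the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed (Item ...) entry; not part of the original answer.
public sealed class ItemRecord
{
    public string Name;
    public int Type;
    public int Length;
    public int Number;
}

public static class ItemParser
{
    // Same idea as the pattern above, with the quotes kept outside the Name group
    // so the captured name does not include them. [^""]* cannot run past a quote.
    private static readonly Regex ItemRegex = new Regex(
        @"\(Item\s+""(?<Name>[^""]*)""\s+(?<Type>[0-9]+)\s+(?<Length>[0-9]+)\s+(?<Number>[0-9]+)\)");

    // Returns one record per (Item ...) entry; an empty input yields an empty list.
    public static List<ItemRecord> Parse(string input)
    {
        var result = new List<ItemRecord>();
        foreach (Match m in ItemRegex.Matches(input))
        {
            result.Add(new ItemRecord
            {
                Name = m.Groups["Name"].Value,
                Type = int.Parse(m.Groups["Type"].Value),
                Length = int.Parse(m.Groups["Length"].Value),
                Number = int.Parse(m.Groups["Number"].Value)
            });
        }
        return result;
    }
}
```

Calling `ItemParser.Parse` on the inner text from the question should give three records, and calling it on an empty string gives an empty list, so the empty-collection case needs no special handling.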

Answer 2

I think you might need to make your captures non-greedy...

(?<Name>.*?)

instead of

(?<Name>.*)
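
To see the difference on the sample input, here is a small sketch that compares the two quantifiers on the name capture alone. The `GreedyDemo` class, its two pattern constants, and the `FirstName` helper are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Greedy: .* extends to the last quote it can still match against.
    public const string Greedy = @"""(?<Name>.*)""";

    // Lazy: .*? stops at the first closing quote.
    public const string Lazy = @"""(?<Name>.*?)""";

    // Returns what the Name group captures in the first match of the pattern.
    public static string FirstName(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups["Name"].Value;
    }
}
```

With the input `(Item "Name1" 1 2 3)(Item "Simple name2" 1 2 3)`, `FirstName(GreedyDemo.Greedy, input)` returns `Name1" 1 2 3)(Item "Simple name2`, because the greedy capture swallows everything up to the last quote, while `FirstName(GreedyDemo.Lazy, input)` returns `Name1`.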

Answer 3

I think you should read the file and then use the String.Split function to split the collection into pieces and process them one by one:

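   // Verbatim string literal: it may span several lines, and inner quotes are doubled.
   string s = @"(Collection
              (Item ""Name1"" 1 2 3)
              (Item ""Simple name2"" 1 2 3)
              (Item ""Just name 3"" 4 5 6))";

   // Splitting on '(' leaves an empty first piece, so start at index 1.
   string[] collection = s.Split('(');
   for (int i = 1; i < collection.Length; i++)
   {
       // Trim whitespace and the closing ')' left over from the split.
       string piece = collection[i].Trim().TrimEnd(')');
       // Skip the "Collection" header piece.
       if (!piece.StartsWith("Item ")) continue;
       // Process the piece here, e.g. Item "Name1" 1 2 3
   }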

This resolves the issue easily, without the extra burden of a regular expression.


Answer 1: 3, accepted, 2011-09-30 10:30:51
Answer 2: 2, 2011-09-30 10:23:29
Answer 3: 2, 2011-09-30 10:27:23
