C# - Extract substrings from a string with changing content

(WHERE)
  (CONDITION OPERATOR="AND")  
   (EXPRESSION NAME="abc" ATTRIBUTE="minor")
   (VALUE)m1(/VALUE)
   (/EXPRESSION)

  (EXPRESSION NAME="abc" ATTRIBUTE="ID")
  (VALUE)ID(/VALUE)
  (/EXPRESSION)

  (EXPRESSION NAME="abc" ATTRIBUTE="major")
  (VALUE)m2(/VALUE)
  (/EXPRESSION)

(/CONDITION)     
(/WHERE)

How can I get three substrings from this string: minor = the first substring with ATTRIBUTE="minor", Id = the next substring with ATTRIBUTE="ID", and so on? The expression NAME may change, so I cannot match against the string as a whole to get the value ID in (VALUE)ID(/VALUE). I hope my question is clear.

Your input has a regular structure, so it can be converted to XML:

<WHERE>
  <CONDITION OPERATOR="AND">
    <EXPRESSION NAME="abc" ATTRIBUTE="minor">
      <VALUE>m1</VALUE>
    </EXPRESSION>
    <EXPRESSION NAME="abc" ATTRIBUTE="ID">
      <VALUE>ID</VALUE>
    </EXPRESSION>
    <EXPRESSION NAME="abc" ATTRIBUTE="major">
      <VALUE>m2</VALUE>
    </EXPRESSION>
  </CONDITION>
</WHERE>

and then queried with XPath, for example //EXPRESSION[@ATTRIBUTE='major']/*[1]

A simple string.Replace may work, but I think it is better to replace only the parentheses that are not inside attribute values. You can use a regular expression to find the quoted strings:

"([^"\\]|\\.)*"

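and record the bounds of each quoted string:

// Requires: using System.Linq; using System.Text.RegularExpressions;
// Store the index of the opening and closing quote of every quoted string.
var stringsBounds = Regex.Matches(input, "\"([^\"\\\\]|\\\\.)*\"")
    .Cast<Match>()
    .Select(m => new
    {
        begin = m.Index,
        end = m.Index + m.Length - 1
    })
    .ToArray();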

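With these bounds you can replace selectively:

// Requires: using System;
// A match is inside a string when it lies strictly between a pair of quotes.
// Each replacement is one character long, so the indices stay valid for the second pass.
Func<Match, bool> isInsideString = m => stringsBounds.Any(b => m.Index > b.begin && m.Index < b.end);
var xmlAsText = Regex.Replace(Regex.Replace(input, "\\(", m => isInsideString(m) ? "(" : "<"),
    "\\)", m => isInsideString(m) ? ")" : ">");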
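
To illustrate why the bounds check matters, here is a minimal sketch that compares a plain string.Replace with a quote-aware replacement. The class and method names (ParenConverter, Naive, QuoteAware) and the sample input with NAME="f(x)" are made up for this example and are not part of the question. The sketch also handles both parentheses in a single Regex.Replace pass rather than the two nested calls above; the result should be the same.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ParenConverter
{
    // Naive approach: swaps every parenthesis, including those inside quoted values.
    public static string Naive(string input) =>
        input.Replace("(", "<").Replace(")", ">");

    // Quote-aware approach: leaves parentheses inside double-quoted strings untouched.
    public static string QuoteAware(string input)
    {
        // Index of the opening and closing quote of every quoted string.
        var bounds = Regex.Matches(input, "\"([^\"\\\\]|\\\\.)*\"")
            .Cast<Match>()
            .Select(m => (Begin: m.Index, End: m.Index + m.Length - 1))
            .ToArray();

        bool InsideString(Match m) =>
            bounds.Any(b => m.Index > b.Begin && m.Index < b.End);

        // One pass over both kinds of parenthesis; each replacement keeps the length.
        return Regex.Replace(input, "[()]", m =>
            InsideString(m) ? m.Value : (m.Value == "(" ? "<" : ">"));
    }

    public static void Main()
    {
        // Hypothetical input: the NAME value itself contains parentheses.
        var input = "(EXPRESSION NAME=\"f(x)\" ATTRIBUTE=\"minor\")(VALUE)m1(/VALUE)(/EXPRESSION)";

        Console.WriteLine(Naive(input));      // NAME becomes "f<x>", which is not valid XML
        Console.WriteLine(QuoteAware(input)); // NAME stays "f(x)"
    }
}
```

With the naive version the attribute value ends up containing a `<`, which XDocument.Parse rejects; the quote-aware version keeps the value intact.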

Now you are ready to query the XML:

// Requires: using System.Xml.Linq; using System.Xml.XPath;
var xml = XDocument.Parse(xmlAsText);

// Selects the first child element of the EXPRESSION with the given ATTRIBUTE.
var expressionSelector = "//EXPRESSION[@ATTRIBUTE='{0}']/*[1]";

foreach (var attribute in new [] {"minor", "major", "ID"})
{
    var xpath = string.Format(expressionSelector, attribute);
    var node = xml.XPathSelectElement(xpath);

    Console.WriteLine($"Attribute: {attribute}, element: {node}");
}
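
If you only need the text of each VALUE keyed by its ATTRIBUTE, LINQ to XML is a possible alternative to XPath. The sketch below is one way to do it, not the only one: ExpressionReader and ReadValues are hypothetical names, and it assumes every EXPRESSION has a unique ATTRIBUTE and a VALUE child (ToDictionary throws on a missing or duplicate key).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ExpressionReader
{
    // Maps each EXPRESSION's ATTRIBUTE to the text of its VALUE child.
    // The NAME attribute is ignored, so it does not matter if it changes.
    public static Dictionary<string, string> ReadValues(string xmlAsText)
    {
        return XDocument.Parse(xmlAsText)
            .Descendants("EXPRESSION")
            .ToDictionary(
                e => (string)e.Attribute("ATTRIBUTE"),
                e => (string)e.Element("VALUE"));
    }

    public static void Main()
    {
        // The converted XML from the answer above.
        var xmlAsText = @"<WHERE>
  <CONDITION OPERATOR=""AND"">
    <EXPRESSION NAME=""abc"" ATTRIBUTE=""minor""><VALUE>m1</VALUE></EXPRESSION>
    <EXPRESSION NAME=""abc"" ATTRIBUTE=""ID""><VALUE>ID</VALUE></EXPRESSION>
    <EXPRESSION NAME=""abc"" ATTRIBUTE=""major""><VALUE>m2</VALUE></EXPRESSION>
  </CONDITION>
</WHERE>";

        var values = ReadValues(xmlAsText);

        foreach (var key in new[] { "minor", "ID", "major" })
        {
            Console.WriteLine($"{key} = {values[key]}");
        }
    }
}
```

Looking values up by ATTRIBUTE gives you the three substrings from the question (m1, ID, m2) without depending on the order of the expressions.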

