Most efficient way to parse the following text

Question

Given the following input:

@class using System.Text.Json
@attribute using System
@attribute using System.Text
@attribute required Type theType
@attribute optional bool IsReadOnly = false
@attribute optional bool SomethingElse

I need to create the following objects:

  new UsingClause(Target.Class, "System.Text.Json");
  new UsingClause(Target.Attribute, "System");
  new UsingClause(Target.Attribute, "System.Text");
  new AttributeProperty(required: true, name: "theType", type: "Type", default: null);
  new AttributeProperty(required: false, name: "IsReadOnly", type: "bool", default: "false");
  new AttributeProperty(required: false, name: "SomethingElse", type: "bool", default: null);

Ordinarily I'd split each line on spaces into an array and work from there, but this will be used in a Roslyn code generator, so I need it to be as fast as possible.

What approach should I use for this kind of script pre-processing task?

I've started down the route of reading each line using a StringReader and then checking it with a compiled Regex. But this whole "ultra efficient" requirement is new to me and I don't want to risk messing it up.

    private static readonly Regex Regex = new Regex(
      // The alternation sits in a non-capturing group so ^\s* and \s*$ apply to every branch.
      pattern: @"^\s*(?:((@attribute)\s+(using)\s+(.*))|((@attribute)\s+(optional|required)\s+(\w+[\w\.]*)\s+(\w+)(\s*=\s*(.*))?)|((@class)\s+(using)\s+(.*)))\s*$",
      options: RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled);
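
For illustration, here is a minimal sketch of how the StringReader plus compiled Regex approach described above might produce the objects listed earlier. The `Target` enum, the shapes of `UsingClause` and `AttributeProperty`, the `DirectiveParser` class, and the named-group pattern are all assumptions made for this example; they are not taken from the original code, and the pattern is a simplified variant that only accepts dotted identifiers for namespaces and types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical types: the question names them but does not show their definitions.
public enum Target { Class, Attribute }

public sealed class UsingClause
{
    public UsingClause(Target target, string ns) { Target = target; Namespace = ns; }
    public Target Target { get; }
    public string Namespace { get; }
}

public sealed class AttributeProperty
{
    public AttributeProperty(bool required, string name, string type, string defaultValue)
    {
        Required = required; Name = name; Type = type; Default = defaultValue;
    }
    public bool Required { get; }
    public string Name { get; }
    public string Type { get; }
    public string Default { get; } // null when the line has no "= value" part
}

public static class DirectiveParser
{
    // Assumed pattern with named groups. The alternation is wrapped in (?: ... )
    // so that the leading ^\s* and trailing \s*$ apply to both branches.
    private static readonly Regex LineRegex = new Regex(
        @"^\s*(?:@(?<target>class|attribute)\s+using\s+(?<ns>\w+(?:\.\w+)*)" +
        @"|@attribute\s+(?<req>required|optional)\s+(?<type>\w+(?:\.\w+)*)\s+(?<name>\w+)" +
        @"(?:\s*=\s*(?<default>.+?))?)\s*$",
        RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.CultureInvariant);

    public static List<object> Parse(string text)
    {
        var result = new List<object>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines

                Match m = LineRegex.Match(line);
                if (!m.Success)
                    throw new FormatException("Unrecognized directive: " + line);

                if (m.Groups["ns"].Success)
                {
                    // "@class using ..." or "@attribute using ..."
                    bool isClass = m.Groups["target"].Value.Equals("class", StringComparison.OrdinalIgnoreCase);
                    result.Add(new UsingClause(isClass ? Target.Class : Target.Attribute, m.Groups["ns"].Value));
                }
                else
                {
                    // "@attribute required|optional <type> <name> [= <default>]"
                    bool required = m.Groups["req"].Value.Equals("required", StringComparison.OrdinalIgnoreCase);
                    string defaultValue = m.Groups["default"].Success ? m.Groups["default"].Value : null;
                    result.Add(new AttributeProperty(required, m.Groups["name"].Value, m.Groups["type"].Value, defaultValue));
                }
            }
        }
        return result;
    }
}
```

Calling `DirectiveParser.Parse(input)` on the sample input should yield three `UsingClause` items followed by three `AttributeProperty` items. Whether this is fast enough for a code generator is something to measure rather than assume; the sketch only shows the shape of the approach.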

Answer 1

It depends on what you consider "efficient", but you should consider writing an ANTLR grammar that parses the text into C# objects, which you can then convert to your desired output objects using something like Antlr4.Runtime.Standard.

Writing an ANTLR grammar would have the benefit of finding any inconsistencies/ambiguities in the input 'language'. It would also allow you to add/change capabilities in the future.

Answer 1: 2 2022-08-06 17:07:08
