How do I show an error for words that are not in the dictionary?
This is a program in which a dictionary of token types is implemented with custom regexes, and it tokenizes every string that is input. Now I want strings that do not match any of the regexes to be displayed with a "not in grammar" line. I have not been able to come up with any solution.
static void Main(string[] args)
{
    string StringRegex = "\"(?:[^\"\\\\]|\\\\.)*\"";
    string IntegerRegex = @"[0-9]+";
    string CommentRegex = @"//.*|/\*[\s\S]*\*/";
    string KeywordRegex = @"\b(?:astart|ainput|atake|aloop|batcommand|batshow|batprint|batmult|batadd|batsub|batdiv|batif|batelse|batgo|batend|till|and)\b";
    string DataTypeRegex = @"\b(?:int|string)\b";
    string IdentifierRegex = @"[a-zA-Z]";
    string ParenthesisRegex = @"\(|\)";
    string BracesRegex = @"\{|\}";
    string ArrayBracketRegex = @"\[|\]";
    string PuncuationRegex = @"\;|\:|\,|\.";
    string RelationalExpressionRegex = @"\>|\<|\==";
    string ArthimeticOperatorRegex = @"\+|\-|\*|\/";
    string WhitespaceRegex = @" ";
    Dictionary<string, string> Regexes = new Dictionary<string, string>()
    {
        {"String", StringRegex},
        {"Integer", IntegerRegex},
        {"Comment", CommentRegex},
        {"Keyword", KeywordRegex},
        {"Datatype", DataTypeRegex},
        {"Identifier", IdentifierRegex},
        {"Parenthesis", ParenthesisRegex},
        {"Brace", BracesRegex},
        {"Square Bracket", ArrayBracketRegex},
        {"Puncuation Mark", PuncuationRegex},
        {"Relational Expression", RelationalExpressionRegex},
        {"Arithmetic Operator", ArthimeticOperatorRegex},
        {"Whitespace", WhitespaceRegex}
    };
    string input;
    input = Convert.ToString(Console.ReadLine());
    var matches = Regexes.SelectMany(a => Regex.Matches(input, a.Value)
        .Cast<Match>()
        .Select(b =>
            new
            {
                Value = b.Value + "\n",
                Index = b.Index,
                Token = a.Key
            }))
        .OrderBy(a => a.Index).ToList();
    for (int i = 0; i < matches.Count; i++)
    {
        if (i + 1 < matches.Count)
        {
            int firstEndPos = (matches[i].Index + matches[i].Value.Length);
            if (firstEndPos > matches[(i + 1)].Index)
            {
                matches.RemoveAt(i + 1);
                i--;
            }
        }
    }
    foreach (var match in matches)
    {
        Console.WriteLine(match);
    }
    Console.ReadLine();
}
The identifier regex should be changed to
var IdentifierRegex = @"\b[a-zA-Z]\b";
Then an input such as asdasdas will not match any regex, and you will be able to test for an empty result, e.g.
if (matches.Count == 0)
    Console.WriteLine("Not in grammar");
else
{ ... }
See this IDEONE demo.
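The empty-result check above only fires when nothing in the input matches at all. If the goal is to flag each unrecognised piece of a longer input, one possible approach is to scan the input left to right and report any position where no rule matches. The following is only a minimal sketch under several assumptions: the `Lexer` class, the `Tokenize` method, the reduced rule set, and the multi-letter identifier pattern `[a-zA-Z]+` are all hypothetical and are not part of the original program or the answer. It relies on the `\G` anchor, which in .NET pins a match to the position passed to `Regex.Match(input, startat)`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original post.
static class Lexer
{
    // Reduced rule set for illustration. Order matters: the first rule that
    // matches at the current position wins, so keywords come before identifiers.
    static readonly (string Token, Regex Pattern)[] Rules =
    {
        ("String",     new Regex(@"\G""(?:[^""\\]|\\.)*""")),
        ("Integer",    new Regex(@"\G[0-9]+")),
        ("Keyword",    new Regex(@"\G\b(?:astart|batshow|till|and)\b")),
        ("Datatype",   new Regex(@"\G\b(?:int|string)\b")),
        ("Identifier", new Regex(@"\G[a-zA-Z]+")),   // assumption: multi-letter identifiers
        ("Whitespace", new Regex(@"\G\s+")),
    };

    public static List<string> Tokenize(string input)
    {
        var output = new List<string>();
        int pos = 0;
        while (pos < input.Length)
        {
            bool matched = false;
            foreach (var rule in Rules)
            {
                // \G forces the match to begin exactly at pos.
                Match m = rule.Pattern.Match(input, pos);
                if (m.Success && m.Length > 0)
                {
                    if (rule.Token != "Whitespace")
                        output.Add(rule.Token + ": " + m.Value);
                    pos += m.Length;
                    matched = true;
                    break;
                }
            }
            if (!matched)
            {
                // No rule matched here: report the character and move on.
                output.Add("Not in grammar: " + input[pos]);
                pos++;
            }
        }
        return output;
    }

    static void Main()
    {
        foreach (string line in Tokenize("int x1 @"))
            Console.WriteLine(line);
        // Prints:
        // Datatype: int
        // Identifier: x
        // Integer: 1
        // Not in grammar: @
    }
}
```

Because every position is either consumed by a rule or reported, this sketch also avoids the overlap-removal loop from the question. The remaining rules from the original dictionary could be added to `Rules` in the same way, each prefixed with `\G`.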