How to Regex match a pattern with parentheses in C#
Background: I'm doing some complicated code generation that requires me to extract the methods within a C# interface file. I cannot simply use reflection because this code will feed a T4 template, which will not have the compiled code to reflect upon. Thus I am attempting parsing. I could easily write my own parser, but it would be nice if there were a regular expression solution.
Question: Is there a regex pattern that would match the method declarations (including the return types and parameters) of the string below using C#'s regular expressions library, and if so, what is it?
string testing = @"
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
public interface Service
{
int Test1(int a);
int Test2(int a, int b);
int Test3(
int a,
int b);
int Test4(out int a);
}
}
";
The regex pattern I desire should make four matches, one for each method declaration (Test1 through Test4).
Solution Attempt: Here is possibly the closest I have come to a regex solution thus far:
string WhiteSpacePattern = @"\s+";
string PossibleWhiteSpacePattern = @"\s*";
string CsharpWordPattern = @"[a-zA-Z_]+";
string ParenthesesPattern = @"[(][\s\S]*?[)]";
string DoubleCsharpWordPattern = CsharpWordPattern + WhiteSpacePattern + CsharpWordPattern;
string MethodDeclarationPattern =
DoubleCsharpWordPattern +
PossibleWhiteSpacePattern +
ParenthesesPattern;
Pattern usage example:
MatchCollection tests = Regex.Matches(testing, MethodDeclarationPattern);
The individual patterns work perfectly (CsharpWordPattern, ParenthesesPattern, WhiteSpacePattern, and PossibleWhiteSpacePattern). However, when I put them all together into a single pattern (MethodDeclarationPattern), the full pattern fails.
How does MethodDeclarationPattern or my usage example need to be altered so that it will start matching the method declarations in the interface code?
To match literal parens, escape them with backslashes:
string ParenthesesPattern = @"\([\s\S]*?\)";
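For comparison, the backslash form and the single-character class form used in the question should behave the same on a literal parenthesis, so the choice between them is mostly about readability. A small sketch to check that; the class and method names here are made up for illustration:

```csharp
using System.Text.RegularExpressions;

public static class ParenDemo
{
    // Escaped form, as in the line above.
    public const string Escaped = @"\([\s\S]*?\)";

    // Character-class form, as in the question.
    public const string CharClass = @"[(][\s\S]*?[)]";

    // Returns the first parenthesized span the pattern finds,
    // or an empty string when nothing matches.
    public static string FirstParens(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }
}
```

Calling ParenDemo.FirstParens on a declaration such as "int Test2(int a, int b);" with either constant should return the same parenthesized text.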
That regex snippet matches an opening parenthesis, then as few characters as possible up to the first closing parenthesis. Note that [\s\S] matches any character at all, whitespace or not, newlines included, so a parameter list between the parentheses is allowed. You're putting it at the end of your overall regex.
Your complete concatenated regex looks like this:
[a-zA-Z_]+\s+[a-zA-Z_]+\s*[(][\s\S]*?[)]
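Running that concatenated pattern against the interface text is a quick way to see where it stops matching. The sketch below compares the pattern exactly as written with a variant whose identifiers may also contain digits. The variant is an assumption added for the comparison, not something from the question, and the class and member names are hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // The concatenated pattern from the question, unchanged.
    public const string Original = @"[a-zA-Z_]+\s+[a-zA-Z_]+\s*[(][\s\S]*?[)]";

    // Assumption: an identifier starts with a letter or underscore
    // and may contain digits after that.
    private const string Word = @"[a-zA-Z_][a-zA-Z0-9_]*";
    public const string WithDigits = Word + @"\s+" + Word + @"\s*[(][\s\S]*?[)]";

    // Collects the text of every match of the pattern in the source.
    public static List<string> FindAll(string source, string pattern)
    {
        return Regex.Matches(source, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

On the interface from the question, FindAll with Original finds nothing, while WithDigits finds the four declarations. That points at the identifier pattern rather than the parentheses: Test1 through Test4 each contain a digit.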
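Identifier, whitespace, identifier, optional whitespace, open paren, as few characters as possible, close paren.
Since [a-zA-Z_]+ accepts only letters and underscores, neither identifier may contain a digit for that to match. The method declaration will have to look like this:
"int foo ()"
Test1 through Test4 each contain a digit, which is why none of your declarations match.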
I believe you'll have better success with something like this:
string openParenPattern = @"\([\s\S]*?";
string closeParenPattern = @"[\s\S]*?\)";
What you really need, conceptually, is a pattern for the whole declaration: return type, method name, open paren, zero or more parameters followed by commas, a final parameter, close paren (leaving out space -- no need to clutter it up with that).
You know all the syntax for that, I think. You'll have nested groups. Looking at it, I'm really starting to warm up to your idea of putting sub-regexes in string variables and then concatenating them.
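To illustrate what nested groups can buy you for code generation, here is a minimal sketch that uses .NET named groups, with a param group nested inside a params group, to pull out the method name and each parameter. All names and sub-patterns are hypothetical. It also assumes simple types only: generic arguments containing commas, arrays, and attributes are not handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Assumed sub-patterns, concatenated in the style of the question.
    private const string Ident = @"[A-Za-z_][A-Za-z0-9_]*";
    private const string Param = @"(?<param>[^,)]+)";

    // ret and name are flat groups; param is nested inside params.
    private static readonly string Pattern =
        "(?<ret>" + Ident + @")\s+(?<name>" + Ident + @")\s*"
        + @"\(\s*(?<params>" + Param + @"(?:\s*,\s*" + Param + @")*)?\s*\)";

    public static string MethodName(string declaration)
    {
        return Regex.Match(declaration, Pattern).Groups["name"].Value;
    }

    // .NET keeps every capture of a repeated group, so each parameter
    // is available through Captures, in source order.
    public static List<string> Parameters(string declaration)
    {
        Match m = Regex.Match(declaration, Pattern);
        return m.Groups["param"].Captures
                .Cast<Capture>()
                .Select(c => c.Value.Trim())
                .ToList();
    }
}
```

For a declaration like "int Test2(int a, int b);", MethodName should give "Test2" and Parameters should give the two parameter strings "int a" and "int b".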
The following code matches all four method declarations in your test string:
// This has one bug: It matches "int foo(int a,)"
// Somebody good with regexes could fix that.
var methodPattern =
// return type
identPattern + spacePattern
// method name
+ identPattern + spacePattern
// open paren
+ openParenPattern + spacePattern
// Zero or more parameters followed by commas
+ "(" + paramPattern + spacePattern + "," + spacePattern + ")*" + spacePattern
// Final (or only) parameter not followed by a comma
+ "(" + paramPattern + spacePattern + ")?" + spacePattern
// Close paren
+ closeParenPattern;
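The snippet above relies on identPattern, spacePattern and paramPattern, whose definitions are not shown here. Below is a self-contained sketch with assumed definitions for them (hypothetical, not the answer's own). It also reshapes the parameter list as "one parameter, then zero or more comma-plus-parameter repeats", which is one way to stop "int foo(int a,)" from matching. It only covers simple identifier types with an optional out or ref modifier.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MethodDeclarations
{
    // Assumed definitions; the originals are not shown above.
    private const string IdentPattern = @"[A-Za-z_][A-Za-z0-9_]*";
    private const string SpacePattern = @"\s*";
    private const string RequiredSpacePattern = @"\s+";

    // Optional out/ref modifier, then a type and a parameter name.
    private const string ParamPattern =
        @"(?:(?:out|ref)\s+)?" + IdentPattern + RequiredSpacePattern + IdentPattern;

    // One parameter followed by zero or more ", parameter" repeats.
    // The whole list is optional, so "()" is accepted too.
    private const string ParamListPattern =
        "(?:" + ParamPattern
        + "(?:" + SpacePattern + "," + SpacePattern + ParamPattern + ")*)?";

    public const string MethodPattern =
        IdentPattern + RequiredSpacePattern      // return type
        + IdentPattern + SpacePattern            // method name
        + @"\(" + SpacePattern                   // open paren
        + ParamListPattern + SpacePattern        // parameters
        + @"\)";                                 // close paren

    // Collects the text of every method declaration found in the source.
    public static List<string> Find(string source)
    {
        return Regex.Matches(source, MethodPattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

Because a comma is only ever consumed together with the parameter that follows it, a trailing comma before the closing paren leaves nothing for the pattern to match.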