C# Regex doesn't correctly match the input string

I'm working on an ASP.NET form application that takes a master course ID from user input and matches it against a format. The format looks like this:

HIST-1302-233IN-FA2012

or it could be

XL-HIST-1302-233IN-FA2012

Here is my regex:

string masterCourseRegex = @"(.{4}-.{4}-.{5}-.{6})/|XL-(.{4}-.{4}-.{5}-.{6})";

I've tested this in Rubular, without the forward slash that precedes the XL alternative, and it seems to work for both formats. But when I test my web app, the code treats HIST-1302-233IN-FA2012 as a non-match, so it takes the branch for an ID that doesn't fit the specified format and shows the "invalid course ID format" message, when it ought to match and move on to the code that actually uses it.

My form correctly recognizes an ID that has XL- in front of it and continues to process as usual; I only have an issue with the standard format without the XL. Here is my code:

if (!Regex.IsMatch(txtBoxMasterCourse.Text, masterCourseRegex))
{
    string msg = string.Empty;
    StringBuilder sb = new StringBuilder();
    sb.Append("alert('The course ID " + txtBoxMasterCourse.Text + " did not match the naming standards for Blackboard course IDs. Please be sure to use the correct naming convention as specified on the form in the example.");
    sb.Append(msg.Replace("\n", "\\n").Replace("\r", "").Replace("'", "\\'"));
    sb.Append("');");
    ScriptManager.RegisterStartupScript(this.Page, this.GetType(), "showalert", sb.ToString(), true);
}

I can't see anything obviously wrong and would appreciate your input.

Thanks!

If we break down your expression and add some comments, it is easier to see the problem.

// Written this way, the pattern needs RegexOptions.IgnorePatternWhitespace.
string masterCourseRegex = @"
   (    # Capture
    .{4}  # Match any character, exactly four times
    -     # Match a single hyphen/minus
    .{4}  # Match any character, exactly four times
    -     # Match a single hyphen/minus
    .{5}  # Match any character, exactly five times
    -     # Match a single hyphen/minus
    .{6}  # Match any character, exactly six times
   )    # End Capture
   /    # Match a single forward slash <----------- HERE IS THE PROBLEM
   |    # OR
   XL   # Match the characters XL
   -    # Match a single hyphen/minus
   (    # Capture
   .{4}   # Match any character, exactly four times
   -      # Match a single hyphen/minus
   .{4}   # Match any character, exactly four times
   -      # Match a single hyphen/minus
   .{5}   # Match any character, exactly five times
   -      # Match a single hyphen/minus
   .{6}   # Match any character, exactly six times
   )    # End Capture
   ";

Removing the forward slash from your original expression will allow it to match both of your examples.

string masterCourseRegex = @"(.{4}-.{4}-.{5}-.{6})|XL-(.{4}-.{4}-.{5}-.{6})";
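
One way to confirm the diagnosis is to run both versions of the pattern against the sample IDs. The following is only a minimal sketch; the class and method names (SlashCheck, Matches) are hypothetical and not part of the original form code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original form code.
public static class SlashCheck
{
    // Pattern from the question: the first alternative requires a trailing "/".
    public const string Original = @"(.{4}-.{4}-.{5}-.{6})/|XL-(.{4}-.{4}-.{5}-.{6})";

    // Same pattern with the stray "/" removed.
    public const string Fixed = @"(.{4}-.{4}-.{5}-.{6})|XL-(.{4}-.{4}-.{5}-.{6})";

    // Returns true when the pattern matches anywhere in courseId.
    public static bool Matches(string pattern, string courseId)
    {
        return Regex.IsMatch(courseId, pattern);
    }
}
```

With the original pattern, SlashCheck.Matches(SlashCheck.Original, "HIST-1302-233IN-FA2012") is false because no "/" follows the ID, while the XL- form still matches through the second alternative. With SlashCheck.Fixed, both IDs match.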

Alternatively, you may want to consider making the expression more specific by eliminating the . (match any character) wildcards. For example:

string masterCourseRegex = @"(XL-)?(\w{4}-\d{4}-[\w\d]{5}-[\w\d]{6})";

This also matches your given examples of "HIST-1302-233IN-FA2012" and "XL-HIST-1302-233IN-FA2012".

It's generally good practice to be as specific as possible in a regular expression. Remember that the . operator matches any character, and its use can make debugging a regular expression more difficult than it needs to be.
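
To illustrate why the wildcard is risky, the sketch below compares a loose pattern with the stricter one on input that is clearly not a course ID. The class name SpecificityCheck, the Loose constant, and the sample inputs are assumptions made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class SpecificityCheck
{
    // Loose: "." accepts any character, including punctuation.
    public const string Loose = @"(XL-)?(.{4}-.{4}-.{5}-.{6})";

    // Stricter pattern: word characters and digits only.
    public const string Specific = @"(XL-)?(\w{4}-\d{4}-[\w\d]{5}-[\w\d]{6})";

    // Returns true when the pattern matches anywhere in courseId.
    public static bool Matches(string pattern, string courseId)
    {
        return Regex.IsMatch(courseId, pattern);
    }
}
```

Here SpecificityCheck.Matches(SpecificityCheck.Loose, "????-####-!!!!!-......") is true, since every character satisfies the wildcard, whereas the Specific pattern rejects it. Both patterns accept "HIST-1302-233IN-FA2012".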

Don't get all fancy. Try something like:

// Comments inside the pattern use single quotes: a double quote would end the verbatim string.
static Regex rx = new Regex( @"
  ^                     # start-of-text
  (XL-)?                # followed by an optional 'XL-' prefix
  [A-Z][A-Z][A-Z][A-Z]  # followed by 4 letters
  -                     # followed by a literal hyphen ('-')
  \d\d\d\d              # followed by 4 decimal digits
  -                     # followed by a literal hyphen ('-')
  \d\d\d[A-Z][A-Z]      # followed by 3 decimal digits and 2 letters ('###XX')
  -                     # followed by a literal hyphen ('-')
  [A-Z][A-Z]\d\d\d\d    # followed by 2 letters and 4 decimal digits ('XX####')
  $                     # followed by end-of-text
  " , RegexOptions.IgnorePatternWhitespace|RegexOptions.IgnoreCase
  ) ;

You should also anchor your match to the start/end of the text (unless you're willing to accept a match on something other than the entire string).
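
The effect of anchoring can be seen with a small comparison. This is a sketch under assumptions: the class name AnchorCheck and the sample input with surrounding text are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class AnchorCheck
{
    // Unanchored: matches if a course ID appears anywhere in the input.
    public const string Unanchored = @"(XL-)?(\w{4}-\w{4}-\w{5}-\w{6})";

    // Anchored: the whole input must be a course ID.
    public const string Anchored = @"^(XL-)?(\w{4}-\w{4}-\w{5}-\w{6})$";

    // Returns true when the pattern matches courseId.
    public static bool Matches(string pattern, string courseId)
    {
        return Regex.IsMatch(courseId, pattern);
    }
}
```

For the input "junk HIST-1302-233IN-FA2012 more junk", the unanchored pattern reports a match and the anchored one does not. Note that in .NET the $ anchor also matches just before a final newline; \z can be used instead if that matters for the form.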

Try this:

string masterCourseRegex = @"(XL-)?(\w{4}-\w{4}-\w{5}-\w{6})";
