Execution randomly jumps to a thrown exception while debugging a unit test

Question

I'm having a very strange problem where execution jumps from semi-predictable locations to another location while I debug a Visual Studio .NET unit test. The method in which this strange behavior occurs is "Parse(...)", below. I've marked in this method the one location that execution jumps to ("// EXCEPTION"). I've also marked several of the places where, in my testing, execution was when it strangely jumped ("// JUMP"). The jump frequently occurs from the same place several times consecutively, and then begins occurring from a new location. The places from which execution jumps are either the beginnings of switch statements or the ends of code blocks, which suggests to me that something funky is going on with the instruction pointer, but I'm not .NET-savvy enough to know what that something might be. If it makes any difference, execution does not jump to immediately before the "throw" statement, but to a point where the exception has just been thrown. Very strange.

In my experience the execution jump only occurs while parsing the contents of a nested named group.

Background on what the code below purports to do: the solution I'm trying to implement is a simple regular expression parser. This is not a full-out regex parser. I only need to locate particular named groups inside a regular expression and replace some of the named groups' contents with other contents. So basically I'm just running through a regular expression and keeping track of the named groups I find. I also keep track of unnamed groups, since I need to be aware of parenthesis matching, and of comments, so that commented parentheses don't upset paren-matching. A separate (and as yet unimplemented) piece of code will reconstruct a string containing the regex after taking the replacements into account.

I greatly appreciate any suggestions of what might be afoot; I'm baffled!

Example Solution

Here is a Visual Studio 2010 Solution (TAR format) containing all the code I discuss below. I get the error when running this solution (with the unit test project "TestRegexParserLibTest" as the Startup Project). Since this seems to be such a sporadic error, I'd be interested to hear whether anyone else experiences the same problem.

Code

I use some simple classes to organize the results:

// The root of the regex we are parsing
public class RegexGroupStructureRoot : ISuperRegexGroupStructure
{
    public List<RegexGroupStructure> SubStructures { get; set; }

    public RegexGroupStructureRoot()
    {
        SubStructures = new List<RegexGroupStructure>();
    }

    public override bool Equals(object obj) { ... }
}

// Either a RegexGroupStructureGroup or a RegexGroupStructureRegex
// Contained within the SubStructures of both RegexGroupStructureRoot and RegexGroupStructureGroup
public abstract class RegexGroupStructure
{
}

// A run of text containing regular expression characters (but not groups)
public class RegexGroupStructureRegex : RegexGroupStructure
{
    public string Regex { get; set; }

    public override bool Equals(object obj) { ... }
}

// A regular expression group
public class RegexGroupStructureGroup : RegexGroupStructure, ISuperRegexGroupStructure
{
    // Name == null indicates an unnamed group
    public string Name { get; set; }
    public List<RegexGroupStructure> SubStructures { get; set; }

    public RegexGroupStructureGroup()
    {
        SubStructures = new List<RegexGroupStructure>();
    }

    public override bool Equals(object obj) { ... }
}

// Items that contain SubStructures
// Either a RegexGroupStructureGroup or a RegexGroupStructureRoot
interface ISuperRegexGroupStructure
{
    List<RegexGroupStructure> SubStructures { get; }
}

Here's the method (and associated enum/static members) where I actually parse the regular expression, returning a RegexGroupStructureRoot that contains all the named groups, unnamed groups, and other regular expression characters that were found.

using Re = System.Text.RegularExpressions;

enum Mode
{
    TopLevel, // Not in any group
    BeginGroup, // Just encountered a character beginning a group: "("
    BeginGroupTypeControl, // Just encountered a character controlling group type, immediately after beginning a group: "?"
    NamedGroupName, // Reading the named group name (must have encountered a character indicating a named group type immediately following a group type control character: "<" after "?")
    NamedGroup, // Reading the contents of a named group
    UnnamedGroup, // Reading the contents of an unnamed group
}

static string _NamedGroupNameValidCharRePattern = "[A-Za-z0-9_]";
static Re.Regex _NamedGroupNameValidCharRe;

static RegexGroupStructureParser()
{
    _NamedGroupNameValidCharRe = new Re.Regex(_NamedGroupNameValidCharRePattern);
}

public static RegexGroupStructureRoot Parse(string regex)
{
    string newLine = Environment.NewLine;
    int newLineLen = newLine.Length;

    // A record of the parent structures that the parser has created
    Stack<ISuperRegexGroupStructure> parentStructures = new Stack<ISuperRegexGroupStructure>();

    // The current text we've encountered
    StringBuilder textConsumer = new StringBuilder();

    // Whether the parser is in an escape sequence
    bool escaped = false;

    // Whether the parser is in an end-of-line comment (such comments run from a hash sign ('#') to the end of the line).
    //  The other type of .NET regular expression comment is the group comment: (?#This is a comment)
    //   We do not need to handle this type of comment specially since it is treated like an unnamed
    //   group.
    bool commented = false;

    // The current mode of the parsing process
    Mode mode = Mode.TopLevel;

    // Push a root onto the parents to accept whatever regexes/groups we encounter
    parentStructures.Push(new RegexGroupStructureRoot());

    foreach (char chr in regex) // A string can be enumerated directly; no ToArray() copy is needed
    {
        if (escaped) // JUMP
        {
            textConsumer.Append(chr);
            escaped = false;
        }
        else if (chr.Equals('#'))
        {
            textConsumer.Append(chr);
            commented = true;
        }
        else if (commented)
        {
            textConsumer.Append(chr);

            string txt = textConsumer.ToString();
            // Does the current text end with a NewLine?
            if (txt.EndsWith(newLine, StringComparison.Ordinal))
            {
                // If so we're no longer in the comment
                commented = false;
            }
        }
        else
        {
            switch (mode) // JUMP
            {
                case Mode.TopLevel:
                    switch (chr)
                    {
                        case '\\':
                            textConsumer.Append(chr); // Append the backslash
                            escaped = true;
                            break;
                        case '(':
                            beginNewGroup(parentStructures, ref textConsumer, ref mode);
                            break;
                        case ')':
                            // Can't close a group if we're already at the top-level
                            throw new InvalidRegexFormatException("Too many ')'s.");
                        default:
                            textConsumer.Append(chr);
                            break;
                    }
                    break;

                case Mode.BeginGroup:
                    switch (chr)
                    {
                        case '?':
                            // If it's an unnamed group, we'll re-add the question mark.
                            // If it's a named group, named groups reconstruct question marks so no need to add it.
                            mode = Mode.BeginGroupTypeControl;
                            break;
                        default:
                            // Only a '?' can begin a named group.  So anything else begins an unnamed group.

                            parentStructures.Peek().SubStructures.Add(new RegexGroupStructureRegex()
                            {
                                Regex = textConsumer.ToString()
                            });
                            textConsumer = new StringBuilder();

                            parentStructures.Push(new RegexGroupStructureGroup()
                            {
                                Name = null, // null indicates an unnamed group
                                SubStructures = new List<RegexGroupStructure>()
                            });

                            mode = Mode.UnnamedGroup;
                            break;
                    }
                    break;

                case Mode.BeginGroupTypeControl:
                    switch (chr)
                    {
                        case '<':
                            mode = Mode.NamedGroupName;
                            break;

                        default:
                            // We previously read a question mark to get here, but the group turned out not to be a named group
                            // So add back in the question mark, since unnamed groups don't reconstruct with question marks
                            textConsumer.Append('?').Append(chr); // '?' + chr would add the two chars as integers
                            mode = Mode.UnnamedGroup;
                            break;
                    }
                    break;

                case Mode.NamedGroupName:
                    if (chr.Equals( '>'))
                    {
                        // '>' closes the named group name.  So extract the name
                        string namedGroupName = textConsumer.ToString();

                        if (namedGroupName == String.Empty)
                            throw new InvalidRegexFormatException("Named group names cannot be empty.");

                        // Create the new named group
                        RegexGroupStructureGroup newNamedGroup = new RegexGroupStructureGroup() {
                            Name = namedGroupName,
                            SubStructures = new List<RegexGroupStructure>()
                        };

                        // Add this group to the current parent
                        parentStructures.Peek().SubStructures.Add(newNamedGroup);
                        // ...and make it the new parent.
                        parentStructures.Push(newNamedGroup);

                        textConsumer = new StringBuilder();

                        mode = Mode.NamedGroup;
                    }
                    else if (_NamedGroupNameValidCharRe.IsMatch(chr.ToString()))
                    {
                        // Append any valid named group name char to the growing named group name
                        textConsumer.Append(chr);
                    }
                    else
                    {
                        // chr is neither a valid named group name character, nor the character that closes the named group name (">").  Error.
                        throw new InvalidRegexFormatException(String.Format("Invalid named group name character: {0}", chr)); // EXCEPTION
                    }
                    break; // JUMP

                case Mode.NamedGroup:
                case Mode.UnnamedGroup:
                    switch (chr) // JUMP
                    {
                        case '\\':
                            textConsumer.Append(chr);
                            escaped = true;
                            break;
                        case ')':
                            closeGroup(parentStructures, ref textConsumer, ref mode);
                            break;
                        case '(':
                            beginNewGroup(parentStructures, ref textConsumer, ref mode);
                            break;
                        default:
                            textConsumer.Append(chr);
                            break;
                    }
                    break;

                default:
                    throw new Exception("Exhausted Modes");
            }
        } // JUMP
    }

    ISuperRegexGroupStructure finalParent = parentStructures.Pop();
    Debug.Assert(parentStructures.Count < 1, "Left parent structures on the stack.");
    Debug.Assert(finalParent.GetType().Equals(typeof(RegexGroupStructureRoot)), "The final parent must be a RegexGroupStructureRoot");

    string finalRegex = textConsumer.ToString();
    if (!String.IsNullOrEmpty(finalRegex))
        finalParent.SubStructures.Add(new RegexGroupStructureRegex() {
            Regex = finalRegex
        });

    return finalParent as RegexGroupStructureRoot;
}
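One C# detail is easy to trip over in a character-by-character parser like this: adding two char values performs integer addition, so passing such a sum to StringBuilder.Append appends a number rather than two characters. The following is a minimal sketch of the difference; the class and method names (CharAppendDemo, AppendWithPlus, AppendChained) are made up for illustration and are not part of the solution above.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the original parser.
public static class CharAppendDemo
{
    // char + char is promoted to int, so the Append(int) overload is chosen
    // and the decimal digits of the sum are appended.
    public static string AppendWithPlus(char chr)
    {
        var sb = new StringBuilder();
        sb.Append('?' + chr);
        return sb.ToString();
    }

    // Chaining Append keeps both operands as characters.
    public static string AppendChained(char chr)
    {
        return new StringBuilder().Append('?').Append(chr).ToString();
    }
}
```

Calling CharAppendDemo.AppendWithPlus(':') yields "121" (63 + 58), while CharAppendDemo.AppendChained(':') yields "?:".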

And here is a unit test that checks whether the method works (note: it may not be 100% correct, since I don't even get past the call to RegexGroupStructureParser.Parse).

[TestMethod]
public void ParseTest_Short()
{
    string regex = @"
        (?<Group1>
            ,?\s+
            (?<Group1_SubGroup>
                [\d–-]+             # One or more digits, hyphen, and/or n-dash
            )            
        )
    ";

    RegexGroupStructureRoot expected = new RegexGroupStructureRoot()
    {
        SubStructures = new List<RegexGroupStructure>()
        {
            new RegexGroupStructureGroup() {
                Name = "Group1", 
                SubStructures = new List<RegexGroupStructure> {
                    new RegexGroupStructureRegex() {
                        Regex = @"
            ,?\s+
            "
                    }, 
                    new RegexGroupStructureGroup() {
                        Name = "Group1_SubGroup", 
                        SubStructures = new List<RegexGroupStructure>() {
                            new RegexGroupStructureRegex() {
                                Regex = @"
                [\d–-]+             # One or more digits, hyphen, and/or n-dash
            "
                            }
                        }
                    }, 
                    new RegexGroupStructureRegex() {
                        Regex = @"            
        "
                    }
                }
            }, 
            new RegexGroupStructureRegex() {
                Regex = @"
        "
            }, 
        }
    };

    RegexGroupStructureRoot actual = RegexGroupStructureParser.Parse(regex);

    Assert.AreEqual(expected, actual);
}
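As an independent sanity check on the group names a test like this expects, the .NET regex engine can be asked which named groups it sees in the same pattern. This is only a sketch under a few assumptions: the helper name (GroupNameCrossCheck.NamedGroups) is invented here, the pattern is assumed to be written for RegexOptions.IgnorePatternWhitespace (it contains '#' comments and layout whitespace), and it checks names only, not the nested structure that Parse builds.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original solution.
public static class GroupNameCrossCheck
{
    // Returns the named groups that .NET itself finds in the pattern,
    // skipping purely numeric names such as the implicit group "0".
    public static string[] NamedGroups(string pattern)
    {
        var re = new Regex(pattern, RegexOptions.IgnorePatternWhitespace);
        return re.GetGroupNames()
                 .Where(name => !name.All(c => char.IsDigit(c)))
                 .ToArray();
    }
}
```

For the pattern in ParseTest_Short this should report Group1 followed by Group1_SubGroup, which is a quick way to spot a name in the expected structure that differs only by letter case.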

Answer 1

Your solution's test case does cause the thrown "Invalid named group name character" exception to halt at the break; rather than at the throw line. I rigged up a test file using a nested if inside a case to see whether the exception halts similarly in one of my projects, and it did not: the halted line was the throw statement itself.

However, when I enable editing (to use Edit and Continue in your project), the current line rewinds back to the throw statement. I haven't looked at the generated IL, but I suspect that the throw is being optimized in a way that confuses the display, but not the actual execution or even the Edit and Continue feature. A throw terminates its case without needing a "break" to follow, like so:

case 1:
   do something
   break;
case 2:
   throw ... //No break required.
case 3:

If Edit and Continue works and the thrown exceptions are properly caught or displayed, I suspect you have a display anomaly that you can ignore (although I would report it to Microsoft along with that file, since it is reproducible).
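One way to convince yourself that only the debugger's highlighted line is off, and not the control flow, is to observe the exception from the caller's side. The sketch below uses invented names (SwitchThrowDemo, Describe, Probe) and a toy switch rather than the parser itself; it shows a case that ends in a throw with no break, and a wrapper that reports what actually escaped.

```csharp
using System;

// Hypothetical demo class, not taken from the linked solution.
public static class SwitchThrowDemo
{
    public static string Describe(int code)
    {
        string result;
        switch (code)
        {
            case 1:
                result = "one";
                break;
            case 2:
                // The end of this section is unreachable, so no break is required.
                throw new InvalidOperationException("code 2 is not supported");
            default:
                result = "other";
                break;
        }
        return result;
    }

    // Reports which exception, if any, actually escaped Describe.
    public static string Probe(int code)
    {
        try { return Describe(code); }
        catch (InvalidOperationException ex) { return "threw: " + ex.Message; }
    }
}
```

If Probe(2) reports the exception while Probe(1) and Probe(3) return normally, the switch is executing as written, whatever line the debugger happens to highlight when it breaks.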

Answer 2

Finally solved this one. In closeGroup, which is referenced in my question and exists in the linked code, I was setting the mode to NamedGroupName instead of NamedGroup. This still doesn't explain the strange instruction pointer/exception business, but at least now I don't get the unexpected exception and the parser parses.
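This also explains why the exception was real rather than a debugger artifact: with the mode left at NamedGroupName after a ')', the next character (whitespace, in the test pattern) is checked as a group-name character and rejected. The body of closeGroup is not shown in the question, so the following is only a sketch of the mode-restoring step under simplifying assumptions: the stack holds open groups only (no root element), Group is a stand-in for RegexGroupStructureGroup, and ModeAfterClose is an invented name.

```csharp
using System;
using System.Collections.Generic;

public enum Mode { TopLevel, NamedGroupName, NamedGroup, UnnamedGroup }

// Simplified stand-in for RegexGroupStructureGroup; Name == null means unnamed.
public class Group
{
    public string Name;
}

// Hypothetical sketch; the real closeGroup also flushes textConsumer.
public static class CloseGroupSketch
{
    // Pops the group that just ended and picks the mode to resume in
    // from the enclosing group, which is now on top of the stack.
    public static Mode ModeAfterClose(Stack<Group> openGroups)
    {
        if (openGroups.Count == 0)
            throw new InvalidOperationException("No open group to close.");

        openGroups.Pop();

        if (openGroups.Count == 0)
            return Mode.TopLevel;

        // Resume reading group contents (NamedGroup), never the group's name
        // (NamedGroupName): the name was consumed when the group was opened.
        return openGroups.Peek().Name != null ? Mode.NamedGroup : Mode.UnnamedGroup;
    }
}
```

With Group1 and Group1_SubGroup on the stack, closing the inner group returns Mode.NamedGroup, and closing the outer one returns Mode.TopLevel.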

2 answers

Answer 1: score 1, accepted, 2011-07-08 19:47:06

Answer 2: score 0, 2011-08-28 00:51:52
