Regular expression for multiline comments

I would like to use a regular expression to match multiline comments in C#. I have @"/[*][\w\d\s]+[*]/", but that expression only matches the text between /* and */ when the comment is on a single line, not when it spans several lines.

Singleline:

       /* xxxxxxxx */

Multiline:

       /*
       xxxxxxx
       */

I'm not sure I explained this well. If you have any questions, or can point me to somewhere that covers this, I would appreciate it.

EDIT: In my example, one class contains:

. . .

    public IList<ClassificationSpan> GetClassificationSpans(SnapshotSpan span)
    {
        List<ClassificationSpan> classifications = new List<ClassificationSpan>();
        string current = span.GetText();
        bool commentFound = false;
        foreach(var item in _colorTextoLanguage.Comments)
        {
            Regex reg = new Regex(item, RegexOptions.IgnoreCase);
            var matches = reg.Matches(current);
            for(int i=0;i<matches.Count;i++)
            {
                commentFound = true;
                Match m =matches[i];
                Span new_span = new Span(span.Start.Position + m.Index, current.Length - m.Index);
                SnapshotSpan new_snapshot = new SnapshotSpan(span.Snapshot, new_span);
                var newText = new_snapshot.GetText();
                classifications.Add(new ClassificationSpan(new_snapshot, _commentType));
            }
        }
        if(commentFound)
            return classifications;
        Classify(classifications, current, span, _colorTextoLanguage.Custom, _classificationType);
        Classify(classifications, current, span, _colorTextoLanguage.Quoted, _stringType);
        Classify(classifications, current, span, _colorTextoLanguage.Keywords, _keywordType);
        Classify(classifications, current, span, _colorTextoLanguage.IdentifierTypes, _identifierType);
        Classify(classifications, current, span, _colorTextoLanguage.Numeric, _numericType);
        return classifications;
    }

. . .

and another class contains:

class ColorTextoLanguage
{
    #region Member Variables

    private List<string> _comments = new List<string>();
    private List<string> _quoted = new List<string>();
    private List<string> _numeric = new List<string>();
    private List<string> _keywords = new List<string>();
    private List<string> _identifierTypes = new List<string>();
    private List<string> _custom = new List<string>();


    #region Properties

    public List<string> Comments
    {
        get{return _comments;}
    }

    public List<string> Quoted
    {
        get{return _quoted;}
    }

    public List<string> Numeric
    {
        get{return _numeric;}
    }

    public List<string> Keywords
    {
        get{return _keywords;}
    }

    public List<string> IdentifierTypes
    {
        get{return _identifierTypes;}
    }

    public List<string> Custom
    {
        get{return _custom;}
    }

    #endregion

    #region ctor

    public ColorTextoLanguage()
    {
        Initialize();
    }

    #endregion

    #region Methods
    private void Initialize()
    {
        _comments.Add("//");
        _comments.Add(@"/\*(?:(?!\*/)(?:.|[\r\n]+))*\*/");

        _quoted.Add(@"([""'])(?:\\\1|.)*?\1");

        _numeric.Add(@"\b\d+\b");

        _keywords.Add(@"\bif\b");
        _keywords.Add(@"\belse\b");
        _keywords.Add(@"\bforeach\b");
        _keywords.Add(@"\bswitch\b");
        _keywords.Add(@"\bcase\b");
        // . . .


        _identifierTypes.Add(@"\bint\b");
        _identifierTypes.Add(@"\bdate\b");
        _identifierTypes.Add(@"\bstring\b");
        // . . .

    }
    #endregion
    #endregion
};

Not sure if this helps, but from what I can see it is quite similar to your example. Thanks in advance.

Try the regex:

/\*(?:(?!\*/).)*\*/

With RegexOptions.Singleline

new Regex(@"/\*(?:(?!\*/).)*\*/", RegexOptions.Singleline);

regex101 demo

(?:(?!\*/).)* matches any run of characters that does not contain */: the negative lookahead (?!\*/) is checked before each character is consumed.
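
As a rough illustration of the pattern above, here is a minimal sketch that runs it over a string holding both a single-line and a multiline comment. The class and method names (BlockCommentDemo, FindComments) are made up for this example and are not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlockCommentDemo
{
    // Pattern from the answer above: "/*", then any character that does not
    // begin "*/", then the closing "*/". Singleline lets "." match "\n".
    private static readonly Regex BlockComment =
        new Regex(@"/\*(?:(?!\*/).)*\*/", RegexOptions.Singleline);

    // Hypothetical helper: returns the text of every block comment found.
    public static string[] FindComments(string source)
    {
        MatchCollection matches = BlockComment.Matches(source);
        string[] result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }
}
```

Calling BlockCommentDemo.FindComments("int a; /* one */ int b; /*\n two\n*/ int c;") should return two entries, one per comment, with the line breaks of the second comment preserved.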

EDIT: Version which should work in both modes:

/\*(?:(?!\*/)(?:.|[\r\n]+))*\*/

/\*([^*]*\*)*?/

/\* matches a forward slash followed by an asterisk.

([^*]*\*)*? matches everything that is not a literal asterisk, zero or more times, followed by a literal asterisk; the group as a whole is repeated lazily, zero or more times.

/ matches a forward slash. If this fails, the engine backtracks to the previous step and attempts one more lazy iteration.

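Not only is this shorter and clearer, it also takes fewer regex steps than the lookahead versions, because it consumes whole runs of non-asterisk characters instead of testing a lookahead before every character. It should be one of the more efficient ways to do this.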

Note: There is no need to worry about the capture collection the group creates. But if you really care, you can make the group non-capturing with ?:

    /\*(?:[^*]*\*)*?/
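
A small sketch of how this pattern behaves, for anyone who wants to try it. No RegexOptions are passed, since [^*] already matches line breaks. The names UnrolledCommentDemo, FirstComment and StricterPattern are invented for this example; StricterPattern in particular is an assumption of mine, not something from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnrolledCommentDemo
{
    // Pattern from the answer above (non-capturing version).
    public const string AnswerPattern = @"/\*(?:[^*]*\*)*?/";

    // Hypothetical variant, not from the answer: "+?" requires at least one
    // "...*" run, so the three-character text "/*/" is not accepted.
    public const string StricterPattern = @"/\*(?:[^*]*\*)+?/";

    // Returns the first match of the pattern, or an empty string if none.
    public static string FirstComment(string source, string pattern)
    {
        Match m = Regex.Match(source, pattern);
        return m.Success ? m.Value : string.Empty;
    }
}
```

One edge case to be aware of: because the group may repeat zero times, AnswerPattern also matches the text /*/, which is not a complete comment. If that matters for your input, the stricter variant is one possible way around it.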

To match a multi line comment you need a simple regex like this:

Regex regex = new Regex(@"/\*.*?\*/", RegexOptions.Singleline);

Hope this helps you in your quest.

The issue is not the regex at all. The reason this is not working is that only a single line of code is passed into the GetClassificationSpans() function at any one time. I'm having the same issue as you, and from the look of the code you have provided, we followed the same tutorial.

This is not really an answer, it'll just help you identify what the actual issue is.
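
To make the point concrete, here is a minimal sketch that compares running a comment regex over the whole text with running it over one line at a time. It assumes, as described above, that the classifier only ever sees a single line per call; LineByLineDemo and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineByLineDemo
{
    private static readonly Regex BlockComment =
        new Regex(@"/\*.*?\*/", RegexOptions.Singleline);

    // Counts matches when the regex sees the whole text at once.
    public static int CountInWholeText(string text)
    {
        return BlockComment.Matches(text).Count;
    }

    // Counts matches when the regex only ever sees one line at a time,
    // mimicking a caller that passes each line separately.
    public static int CountLineByLine(string text)
    {
        int count = 0;
        foreach (string line in text.Split('\n'))
        {
            count += BlockComment.Matches(line).Count;
        }
        return count;
    }
}
```

For the multiline example from the question ("/*\nxxxxxxx\n*/"), the whole-text count should be 1 while the line-by-line count is 0, because no single line contains both /* and */. For the single-line example both counts are 1, which fits the behaviour described in the question.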
