Using Roslyn to replace all nodes within span

Question

I have a very large amount of generated C# code that I want to pre-process using Roslyn, to help with subsequent manual refactoring.

The code contains start and end comment blocks with a known structure, and I need to refactor the code between the blocks into methods.

Fortunately all state in the generated code is global so we can guarantee that the target methods will require no arguments.

For example, the following code:

public void Foo()
{
    Console.WriteLine("Before block");

    // Start block
    var foo = 1;
    var bar = 2;
    // End block

    Console.WriteLine("After block");
}

Should be converted into something similar to:

public void Foo()
{
    Console.WriteLine("Before block");

    TestMethod();

    Console.WriteLine("After block");
}

private void TestMethod()
{
    var foo = 1;
    var bar = 2;
}

Obviously, this is a contrived example. A single method may have any number of these comment and code blocks.

I have looked into CSharpSyntaxRewriter and have got as far as extracting a collection of SyntaxTrivia objects for these comments. My naive approach was to override VisitMethodDeclaration(), identify the span of the code between the start and end comments, and somehow extract the nodes.

I have been able to use node.GetText().Replace(codeSpan), but I don't know how I can use the result.
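A note on using that result: SourceText.Replace returns a new SourceText, and one way to get from there back to syntax nodes is SyntaxTree.WithChangedText. The following is only a minimal sketch under stated assumptions. The span is measured against the whole tree's text (as far as I know, the text returned by node.GetText() covers only that node, so spans taken from the tree would need offsetting), and ReplaceSpan, TextReplaceDemo and the sample source are hypothetical names made up for illustration.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Text;

public static class TextReplaceDemo
{
    // Hypothetical helper: swaps the text inside 'span' and returns a tree over the new text.
    public static SyntaxTree ReplaceSpan(SyntaxTree tree, TextSpan span, string replacement)
    {
        // SourceText is immutable, so Replace returns a new instance
        SourceText changed = tree.GetText().Replace(span, replacement);

        // WithChangedText produces a tree for the edited text
        return tree.WithChangedText(changed);
    }

    public static void Main()
    {
        const string statement = "var foo = 1;";
        var tree = CSharpSyntaxTree.ParseText("class C { void Foo() { var foo = 1; } }");

        // Locate the statement by plain text search, purely for the demo
        var start = tree.GetText().ToString().IndexOf(statement, StringComparison.Ordinal);
        var span = new TextSpan(start, statement.Length);

        var newTree = ReplaceSpan(tree, span, "TestMethod();");
        Console.WriteLine(newTree.GetRoot().ToFullString());
    }
}
```

This treats the code as text for the edit itself, so it does nothing to carry the removed statements over into a new method; it only shows how an edited SourceText can become a syntax tree again.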

I have seen many examples of using CSharpSyntaxRewriter, but they all seem suspiciously trivial, and none involve refactoring multiple related nodes.

Would I be better off using a DocumentEditor? Is there a common approach for this sort of refactoring?

I could be lazy and not use Roslyn at all, but structured parsing of code seems a more elegant solution than regular expressions and treating the source as plain text.

Answer 1

I've managed to get promising results with DocumentEditor.

My code looks like somebody has fumbled their way, by trial and error, through the SDK, and the approach to removing the trailing comments seems remarkably janky, but it all seems to work (at least for trivial examples).

Here's the rough as guts proof of concept.

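using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Editing;
using Microsoft.CodeAnalysis.Formatting;
using Microsoft.CodeAnalysis.Text;
using static Microsoft.CodeAnalysis.CSharp.SyntaxFactory;

public class Program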
{
    static async Task Main()
    {
        var document = CreateDocument(@"..\..\..\TestClass.cs");

        var refactoredClass = await Refactor(document);
        Console.Write(await refactoredClass.GetTextAsync());
    }

    private static async Task<Document> Refactor(Document document)
    {
        var documentEditor = await DocumentEditor.CreateAsync(document);

        var syntaxRoot = await document.GetSyntaxRootAsync();
        var comments = syntaxRoot
            .DescendantTrivia()
            .Where(t => t.IsKind(SyntaxKind.SingleLineCommentTrivia))
            .ToList();

        // Identify comments which are used to target candidate code to be refactored
        var startComments = new Queue<SyntaxTrivia>(comments.Where(c => c.ToString().TrimEnd() == "// Start block"));
        var endBlock = new Queue<SyntaxTrivia>(comments.Where(c => c.ToString().TrimEnd() == "// End block"));

        // Identify class in target file
        var parentClass = syntaxRoot.DescendantNodes().OfType<ClassDeclarationSyntax>().First();

        var blockIndex = 0;

        foreach (var startComment in startComments)
        {
            var targetMethodName = $"TestMethod_{blockIndex}";

            var endComment = endBlock.Dequeue();

            // Create invocation for method containing refactored code
            var testMethodInvocation =
                ExpressionStatement(
                        InvocationExpression(
                            IdentifierName(targetMethodName)))
                    .WithLeadingTrivia(Whitespace("\n"))
                    .WithTrailingTrivia(Whitespace("\n\n"));

            // Identify nodes between start and end comments, recursing only for nodes outside comments
            var nodes = syntaxRoot.DescendantNodes(c => c.SpanStart <= startComment.Span.Start)
                .Where(n =>
                    n.Span.Start > startComment.Span.End &&
                    n.Span.End < endComment.SpanStart)
                .Cast<StatementSyntax>()
                .ToList();

            // Construct list of nodes to add to target method, removing starting comment
            var targetNodes = nodes.Select((node, nodeIndex) => nodeIndex == 0 ? node.WithoutLeadingTrivia() : node).ToList();

            // Remove end comment trivia which is attached to the node after the nodes we have refactored
            // FIXME this is nasty and doesn't work if there are no nodes after the end comment
            var endCommentNode = syntaxRoot.DescendantNodes().FirstOrDefault(n => n.SpanStart > nodes.Last().Span.End && n is StatementSyntax);
            if (endCommentNode != null) documentEditor.ReplaceNode(endCommentNode, endCommentNode.WithoutLeadingTrivia());

            // Create target method, containing selected nodes
            var testMethod =
                MethodDeclaration(
                        PredefinedType(
                            Token(SyntaxKind.VoidKeyword)),
                        Identifier(targetMethodName))
                    .WithModifiers(
                        TokenList(
                            Token(SyntaxKind.PublicKeyword)))
                    .WithBody(Block(targetNodes))
                    .NormalizeWhitespace()
                    .WithTrailingTrivia(Whitespace("\n\n"));

            // Add method invocation
            documentEditor.InsertBefore(nodes.Last(), testMethodInvocation);

            // Remove nodes from main method
            foreach (var node in nodes) documentEditor.RemoveNode(node);

            // Add new method to class
            documentEditor.InsertMembers(parentClass, 0, new List<SyntaxNode> { testMethod });

            blockIndex++;
        }

        // Return formatted document
        var updatedDocument = documentEditor.GetChangedDocument();
        return await Formatter.FormatAsync(updatedDocument);
    }

    private static Document CreateDocument(string sourcePath)
    {
        var workspace = new AdhocWorkspace();
        var projectId = ProjectId.CreateNewId();
        var versionStamp = VersionStamp.Create();
        var projectInfo = ProjectInfo.Create(projectId, versionStamp, "NewProject", "Test", LanguageNames.CSharp);
        var newProject = workspace.AddProject(projectInfo);

        var source = File.ReadAllText(sourcePath);
        var sourceText = SourceText.From(source);

        return workspace.AddDocument(newProject.Id, Path.GetFileName(sourcePath), sourceText);
    }
}

I'd be interested to see if I'm making life hard for myself with any of this -- I'm sure there are more elegant ways to do what I'm trying to do.
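One possible simplification of the node-selection step, sketched here rather than tested against real generated code: find the innermost block whose span covers both marker comments and filter that block's Statements directly. This avoids the Cast<StatementSyntax>() and does not depend on which token the end comment happens to be attached to. It only covers selecting the statements, not removing the comments. BlockFinder and FindStatementsBetween are hypothetical names, and the sketch assumes each marker appears once and that both sit in the same block.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class BlockFinder
{
    // Hypothetical helper: returns the statements that sit between the two marker comments.
    public static List<StatementSyntax> FindStatementsBetween(
        SyntaxNode root, string startMarker, string endMarker)
    {
        var comments = root.DescendantTrivia()
            .Where(t => t.IsKind(SyntaxKind.SingleLineCommentTrivia))
            .ToList();

        // Assumption: each marker occurs exactly once; First() throws if one is missing
        var start = comments.First(c => c.ToString().TrimEnd() == startMarker);
        var end = comments.First(c => c.ToString().TrimEnd() == endMarker);

        // DescendantNodes() walks in document order, so the last block whose span
        // covers both comments is the innermost one
        var block = root.DescendantNodes()
            .OfType<BlockSyntax>()
            .Last(b => b.Span.Contains(start.Span) && b.Span.Contains(end.Span));

        // Keep only the block's direct statements that lie strictly between the comments
        return block.Statements
            .Where(s => s.SpanStart > start.Span.End && s.Span.End < end.SpanStart)
            .ToList();
    }

    public static void Main()
    {
        const string source = @"
class C
{
    public void Foo()
    {
        Console.WriteLine(""Before block"");

        // Start block
        var foo = 1;
        var bar = 2;
        // End block

        Console.WriteLine(""After block"");
    }
}";
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        // Prints the two local declarations between the markers
        foreach (var statement in FindStatementsBetween(root, "// Start block", "// End block"))
            Console.WriteLine(statement.ToString());
    }
}
```

Because the filter works on comment spans rather than on the node that follows the end comment, the same selection should still work when the end comment is the last thing before the closing brace.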

Answer 1: 0 ACCEPTED 2019-08-23 08:13:16