Roslyn - How can I replace multiple nodes with multiple nodes each?

Question

Background:

Using Roslyn with C#, I am trying to expand auto-implemented properties, so that the accessor bodies can have code injected by later processing. I am using StackExchange.Precompilation as the compiler hook, so these syntax transformations occur in the build pipeline, not as part of an analyzer or refactoring.

I want to turn this:

[SpecialAttribute]
int AutoImplemented { get; set; }

into this:

[SpecialAttribute]
int AutoImplemented {
    get { return _autoImplemented; }
    set { _autoImplemented = value; }
}

private int _autoImplemented;

The problem:

I have been able to get simple transformations working, but I'm stuck on auto-properties, and a few others that are similar in some ways. The trouble I'm having is in using the SyntaxNodeExtensions.ReplaceNode and SyntaxNodeExtensions.ReplaceNodes extension methods correctly when replacing more than one node in a tree.

I am using a class extending CSharpSyntaxRewriter for the transformations. I'll just share the relevant members of that class here. This class visits each class and struct declaration, and then replaces any property declarations that are marked with SpecialAttribute.

private readonly SemanticModel model;

public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node) {
    if (node == null) throw new ArgumentNullException(nameof(node));
    node = VisitMembers(node);
    return base.VisitClassDeclaration(node);
}

public override SyntaxNode VisitStructDeclaration(StructDeclarationSyntax node) {
    if (node == null) throw new ArgumentNullException(nameof(node));
    node = VisitMembers(node);
    return base.VisitStructDeclaration(node);
}

private TNode VisitMembers<TNode>(TNode node)
    where TNode : SyntaxNode {

    IEnumerable<PropertyDeclarationSyntax> markedProperties = 
        node.DescendantNodes()
            .OfType<PropertyDeclarationSyntax>()
            .Where(prop => prop.HasAttribute<SpecialAttribute>(model));

    foreach (var prop in markedProperties) {
        SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
        //If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
        //ReplaceNode appears to not be replacing anything
        node = node.ReplaceNode(prop, expanded);
    }

    return node;
}

private SyntaxList<SyntaxNode> ExpandProperty(PropertyDeclarationSyntax node) {
    //Generates list of new syntax elements from original.
    //This method will produce correct output.
}

HasAttribute<TAttribute> is an extension method I defined for PropertyDeclarationSyntax that checks if that property has an attribute of the given type. This method works correctly.

I believe I am just not using ReplaceNode correctly. There are three related methods:

TRoot ReplaceNode<TRoot>(
    TRoot root,
    SyntaxNode oldNode,
    SyntaxNode newNode);

TRoot ReplaceNode<TRoot>(
    TRoot root,
    SyntaxNode oldNode,
    IEnumerable<SyntaxNode> newNodes);

TRoot ReplaceNodes<TRoot, TNode>(
    TRoot root, 
    IEnumerable<TNode> nodes, 
    Func<TNode, TNode, SyntaxNode> computeReplacementNode);

I am using the second one, because I need to replace each property node with both field and property nodes. I need to do this with many nodes, but there is no overload of ReplaceNodes that allows one-to-many node replacement. The only workaround I found for the missing overload was a foreach loop, which seems very 'imperative' and against the functional feel of the Roslyn API.

Is there a better way to perform batch transformations like this?

Update: I found a great blog series on Roslyn and dealing with its immutability. I haven't found the exact answer yet, but it looks like a good place to start. https://joshvarty.wordpress.com/learn-roslyn-now/

Update: So here is where I'm really confused. I know that the Roslyn API is all based on immutable data structures, and the problem here is in a subtlety of how the copying of structures is used to mimic mutability. I think the problem is that every time I replace a node in my tree, I then have a new tree, and so when I call ReplaceNode that tree supposedly doesn't contain my original node that I want to replace.

It is my understanding that the way trees are copied in Roslyn is that, when you replace a node in a tree you actually create a new tree that references all the same nodes of the original tree, except the node you replaced and all nodes directly above that one. The nodes below the replaced node may be removed if the replacement node no longer references them, or new references may be added, but all the old references still point to the same node instances as before. I am pretty sure this is exactly what Anders Hejlsberg describes in this interview on Roslyn (20 to 23 min in).

So shouldn't my new node instance still contain the same prop instances found in my original sequence?

Hacky solution for special cases:

I was finally able to get this particular problem of transforming property declarations to work by relying on property identifiers, which do not change across tree transformations. However, I would still like a general solution for replacing multiple nodes with multiple nodes each. This solution really works around the API rather than with it.

Here is the special case solution:

private TNode VisitMembers<TNode>(TNode node)
    where TNode : SyntaxNode {

    //Collect the names up front; identifiers survive tree replacement, node instances do not.
    IEnumerable<string> markedPropertyNames =
        node.DescendantNodes()
            .OfType<PropertyDeclarationSyntax>()
            .Where(prop => prop.HasAttribute<SpecialAttribute>(model))
            .Select(prop => prop.Identifier.ValueText)
            .ToList();

    foreach (var propName in markedPropertyNames) {
        //Look the property up again in the current version of the tree.
        var oldProp = node.DescendantNodes()
            .OfType<PropertyDeclarationSyntax>()
            .Single(p => p.Identifier.ValueText == propName);

        SyntaxList<SyntaxNode> newProp = ExpandProperty(oldProp);

        node = node.ReplaceNode(oldProp, newProp);
    }

    return node;
}

Another similar problem I am working on is modifying all return statements in a method to insert postcondition checks. Unlike a property declaration, a return statement obviously has no unique identifier to rely on.

Answer 1

When you do this:

foreach (var prop in markedProperties) {
    SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
    //If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
    //ReplaceNode appears to not be replacing anything
    node = node.ReplaceNode(prop, expanded);
}

After the first replacement, node (your class, for example) no longer contains the original property nodes.

In Roslyn, everything is immutable, so the first replacement works, but it gives you a new tree/node, and the remaining prop instances still belong to the old one.
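
A small sketch of this effect, assuming the Microsoft.CodeAnalysis.CSharp package is referenced. The names StaleNodeDemo, C, A and B are made up for illustration. It replaces one property and then checks whether the other, untouched property instance taken from the old tree still belongs to the new tree:

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical demo class, not part of the question's code.
public static class StaleNodeDemo
{
    // Reports, for property B, whether the old instance is contained in the new tree,
    // whether old and new instances are the same object, and whether they are
    // structurally equivalent.
    public static (bool ContainedInNewTree, bool SameInstance, bool Equivalent) Run()
    {
        SyntaxNode root = CSharpSyntaxTree
            .ParseText("class C { int A { get; set; } int B { get; set; } }")
            .GetRoot();

        var props = root.DescendantNodes().OfType<PropertyDeclarationSyntax>().ToList();
        PropertyDeclarationSyntax a = props[0];
        PropertyDeclarationSyntax b = props[1];

        // Replace only A; B is not touched by this edit.
        SyntaxNode newRoot = root.ReplaceNode(
            a, a.WithIdentifier(SyntaxFactory.Identifier("Renamed")));

        // Find B again in the new tree by name.
        var bInNewTree = newRoot.DescendantNodes()
            .OfType<PropertyDeclarationSyntax>()
            .Single(p => p.Identifier.ValueText == "B");

        return (newRoot.Contains(b),
                ReferenceEquals(b, bInNewTree),
                b.IsEquivalentTo(bInNewTree));
    }
}
```

As far as I understand it, the first two values come out false and the third true. The internal node data is shared between the trees, but the SyntaxNode objects you hold also carry their parent and position, so each new tree hands out new instances even for parts that were not edited.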

To make it work you can consider one of the following:

Build the result in your rewriter class without changing the original tree, and replace everything at once when you finish. In your case, that means replacing the class node in one step. I think this is a good option when you want to replace a statement (I used it when I wrote code to convert LINQ query (comprehension) syntax to fluent syntax), but for a whole class it may not be optimal.
Use SyntaxAnnotation or TrackNodes to find a node after the tree has changed. With these options you can change the tree however you want and still keep track of the old nodes in the new tree.
Use DocumentEditor, which lets you make multiple changes to a document and then returns a new Document.
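
One way the first option might look, as a rough sketch rather than a definitive implementation. MemberExpander, isMarked and expand are hypothetical names; expand stands in for the question's ExpandProperty, and isMarked is assumed to be a purely syntactic check, because the nodes it receives may already belong to a rewritten tree that a SemanticModel does not know about:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical rewriter: rebuilds each class's member list in a single pass.
public sealed class MemberExpander : CSharpSyntaxRewriter
{
    private readonly Func<PropertyDeclarationSyntax, bool> isMarked;
    private readonly Func<PropertyDeclarationSyntax, IEnumerable<MemberDeclarationSyntax>> expand;

    public MemberExpander(
        Func<PropertyDeclarationSyntax, bool> isMarked,
        Func<PropertyDeclarationSyntax, IEnumerable<MemberDeclarationSyntax>> expand)
    {
        this.isMarked = isMarked;
        this.expand = expand;
    }

    public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node)
    {
        // Let the base rewriter handle nested declarations first.
        var visited = (ClassDeclarationSyntax)base.VisitClassDeclaration(node);

        // Map every member to one or more members, then swap the whole list at once.
        var members = visited.Members.SelectMany(m =>
            m is PropertyDeclarationSyntax p && isMarked(p)
                ? expand(p)
                : new[] { m });

        return visited.WithMembers(SyntaxFactory.List(members));
    }
}
```

Because the member list is replaced as a whole, no stale node ever has to be looked up in a newer tree. A VisitStructDeclaration override would follow the same pattern.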

If you need an example of any of them, let me know.
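
As a rough illustration of the TrackNodes option, here is a minimal sketch of a general one-to-many replacement loop. TrackedReplacement, ReplaceEach and expand are hypothetical names (expand plays the role of the question's ExpandProperty). TrackNodes, GetCurrentNode and ReplaceNode are the Roslyn extension methods. The sketch assumes every target sits in a list position, such as a class member or a statement in a block, since the one-to-many ReplaceNode overload needs that:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

// Hypothetical helper, not part of Roslyn.
public static class TrackedReplacement
{
    public static TRoot ReplaceEach<TRoot, TNode>(
        TRoot root,
        IReadOnlyList<TNode> targets,
        Func<TNode, IEnumerable<SyntaxNode>> expand)
        where TRoot : SyntaxNode
        where TNode : SyntaxNode
    {
        // Annotate every target so it can be found again after the tree changes.
        root = root.TrackNodes(targets);

        foreach (TNode original in targets)
        {
            // Look up the node in the current tree that corresponds to the original.
            TNode current = root.GetCurrentNode(original);
            if (current == null)
            {
                continue; // an earlier edit removed it
            }

            // Each replacement yields a new root; the tracking annotations carry over.
            root = root.ReplaceNode(current, expand(current));
        }

        return root;
    }
}
```

The foreach loop is still there, but it no longer depends on names or other unique identifiers, so the same helper should also cover cases like the return statements mentioned in the question. Note that the node passed to expand comes from the tracked tree, so any SemanticModel queries would have to be made against the original node instead.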

solution1
3 2016-12-08 16:41:35