
How to map C# compiler error location (line, column) onto the SyntaxTree produced by Roslyn API?

So:

  • The C# compiler reports error locations as (line, column) pairs.
  • The Roslyn API expects a sequential (absolute) character position in the text.

How to map the former to the latter?

The C# code could be UTF8 with or without the BOM or even UTF16. It could contain all kinds of characters in the form of comments or embedded strings.

Let us assume we know the encoding and have the respective Encoding object handy. I can convert the file bytes to char[]. The problem is that some chars may contribute zero to the final sequential position. I know that the BOM character does. I have no idea whether others do too.

Now, if we know for sure that BOM is the only character that contributes 0 to the length, then I can skip it and count the characters and my question becomes trivial. This is what I do today - I just assume that the BOM is the only "bad" player.
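To illustrate where the BOM mismatch comes from, here is a minimal sketch using only the base class library. The names BomDemo, DecodeRaw and DecodeWithReader are made up for this example. It assumes a UTF-8 file with a BOM: decoding the raw bytes keeps the BOM as a U+FEFF character, while StreamReader consumes it.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // Decodes the raw bytes directly; a leading BOM survives as the character U+FEFF.
    public static string DecodeRaw(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // Decodes through StreamReader, which consumes a matching BOM instead of returning it.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Build an in-memory "file": UTF-8 BOM (EF BB BF) followed by some source text.
        byte[] bom = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes("class C {}");
        byte[] file = new byte[bom.Length + body.Length];
        bom.CopyTo(file, 0);
        body.CopyTo(file, bom.Length);

        string raw = DecodeRaw(file);
        string viaReader = DecodeWithReader(file);

        // The raw decode is one char longer because it still starts with U+FEFF.
        Console.WriteLine(raw.Length - viaReader.Length);
        Console.WriteLine(raw[0] == '\uFEFF');
    }
}
```

So whether the BOM has to be skipped by hand depends on how the bytes were turned into characters, not on the file alone.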

But maybe there is a better way? Maybe the Roslyn API contains some hidden gem that accepts (line, column) and returns the sequential position? Or maybe one of the Microsoft.Build libraries does?

EDIT 1

As per the accepted answer, the following gives the location:

// err.Line and err.Column are 1-based; SourceText line indexes and offsets are 0-based.
var srcText = SourceText.From(File.ReadAllText(err.FilePath));
int location = srcText.Lines[err.Line - 1].Start + err.Column - 1;

You have uncovered the reason that the SourceText type exists in the Roslyn APIs. Its entire purpose is to handle the encoding of strings and perform calculations of lines, columns, and spans.

Due to the way .NET handles Unicode, and depending on which code pages are installed in your OS, there could be cases where SourceText does not do what you need. It has generally proven "good enough" for our purposes, though.
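As a fuller sketch of the same idea, the snippet below turns a 1-based (line, column) pair into an absolute position and then looks up the token at that position in the syntax tree. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced and that the compiler counts columns in the same char units as SourceText. ErrorLocationMapper, LoadText, ToPosition and FindTokenAt are names invented for this example, not part of Roslyn.

```csharp
using System;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Text;

static class ErrorLocationMapper
{
    // Reads the file through SourceText so that Roslyn handles the decoding.
    public static SourceText LoadText(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            return SourceText.From(stream);
        }
    }

    // line and column are 1-based, as printed in messages like "Foo.cs(3,9): error ...".
    public static int ToPosition(SourceText text, int line, int column)
    {
        // LinePosition is 0-based, so shift both values down by one.
        return text.Lines.GetPosition(new LinePosition(line - 1, column - 1));
    }

    // Returns the token that contains the reported location.
    public static SyntaxToken FindTokenAt(SyntaxTree tree, int line, int column)
    {
        int position = ToPosition(tree.GetText(), line, column);
        return tree.GetRoot().FindToken(position);
    }
}

static class Demo
{
    static void Main()
    {
        // In-memory text keeps the demo self-contained; use LoadText(path) for a real file.
        var text = SourceText.From("class C\n{\n    int x = 1;\n}\n");
        var tree = CSharpSyntaxTree.ParseText(text);

        SyntaxToken token = ErrorLocationMapper.FindTokenAt(tree, 3, 9);
        Console.WriteLine(token.Text + " at " + token.Span);
    }
}
```

Using the SourceText that belongs to the tree (tree.GetText()) keeps the position calculation and the tree in agreement about what the text is.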
