
How to convert string to char in C

I'm writing a compiler in C and need to get the ASCII value of a character defined in a source code file. For normal letters this is simple, but is there any way to convert the string "\n" to the ASCII number for '\n' in C? It needs to work on all characters.

Cheers

If the string is one character long, you can just index it:

char *s = "\n";
int ascii = s[0];

However, if you are on a system where the character set used is not ASCII, the above will not give you an ASCII value. If you need to make sure your code runs on such rare machines, you can build yourself an ASCII table and use that.
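One way such a table could look, as a rough sketch: list the printable ASCII characters in code order and search that list, so the result does not depend on the execution character set. The name `to_ascii` and the choice of which control characters to handle are my own assumptions, not something from the question.

```c
#include <string.h>

/* The 95 printable ASCII characters in code order, starting at 32 (space). */
static const char printable[] =
    " !\"#$%&'()*+,-./0123456789:;<=>?@"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`"
    "abcdefghijklmnopqrstuvwxyz{|}~";

/* Hypothetical helper: returns the ASCII code of c, or -1 if c has none
   in this table. Works by position, not by the machine's own encoding. */
int to_ascii(char c)
{
    const char *p;

    /* Control characters that C can name directly. */
    switch (c) {
        case '\0': return 0;
        case '\a': return 7;
        case '\b': return 8;
        case '\t': return 9;
        case '\n': return 10;
        case '\v': return 11;
        case '\f': return 12;
        case '\r': return 13;
    }

    p = strchr(printable, c);
    return p ? 32 + (int)(p - printable) : -1;
}
```

On an ASCII machine this returns the same number as a plain cast for these characters. It only matters on the rare systems mentioned above.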

If, on the other hand, you have two characters, i.e.

char *s = "\\n";

then you can do something like this:

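int char_value(const char *s)
{
    char c = s[0];
    if (c == '\\') {
        c = s[1]; /* assume s is long enough */
        switch (c) {
            case 'n': return '\n';
            case 't': return '\t';
            /* ... other escapes ... */
            default: return c; /* unknown escape: the character itself */
        }
    }
    return c; /* not an escape: the character itself */
}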

The above assumes that your current compiler knows what '\n' means. If it doesn't, you can still do it. To find out how, and for a fascinating story, see "Reflections on Trusting Trust" by Ken Thompson.

I'm writing a compiler in C

It's probably not a good idea to do it all in raw C. It's far better to use something like Bison to handle the initial parsing.

That said, the best way of handling backslash escapes is just to have a lookup table of what each escape turns into.
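A minimal sketch of such a table, assuming the language being compiled uses the same single-character escapes as C. The function name `escape_value` is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: c is the character that follows the backslash.
   Returns the character the escape denotes, or -1 if it is not in the
   table. The entries below mirror C; change them to suit your language. */
int escape_value(char c)
{
    static const struct { char name; char value; } table[] = {
        { 'n', '\n' }, { 't', '\t' }, { 'r', '\r' }, { '0', '\0' },
        { 'a', '\a' }, { 'b', '\b' }, { 'f', '\f' }, { 'v', '\v' },
        { '\\', '\\' }, { '\'', '\'' }, { '"', '"' }
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].name == c)
            return (unsigned char)table[i].value;
    return -1;
}
```

Returning -1 for an unknown escape lets the caller decide whether that is an error or just the character itself.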

You will need to write your own parser/converter. The list of escape sequences can be found online in many places. Parsing C-style syntax is extremely difficult, so you may also wish to check out existing free implementations such as Clang.
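C's own list of escapes also includes numeric forms: a backslash followed by one to three octal digits, or by `x` and hex digits. Here is a rough sketch of parsing those, assuming you want C's rules. The name `numeric_escape` and its interface are only illustrative.

```c
#include <ctype.h>

/* Hypothetical helper: s points just past a backslash. If a numeric escape
   starts there (e.g. "x41" or "101"), stores its value in *out and returns
   the number of characters consumed; otherwise returns 0. */
int numeric_escape(const char *s, unsigned long *out)
{
    unsigned long value = 0;
    int n = 0;

    if (s[0] == 'x') {
        /* hex: 'x' then one or more hex digits (no overflow check here) */
        for (n = 1; isxdigit((unsigned char)s[n]); n++) {
            int c = tolower((unsigned char)s[n]);
            /* assumes 'a'..'f' are contiguous, true for ASCII and EBCDIC */
            value = value * 16 + (isdigit(c) ? c - '0' : c - 'a' + 10);
        }
        if (n == 1)
            return 0;   /* "\x" with no digits */
    } else {
        /* octal: one to three digits in the range 0..7 */
        while (n < 3 && s[n] >= '0' && s[n] <= '7')
            value = value * 8 + (unsigned long)(s[n++] - '0');
    }

    if (n > 0)
        *out = value;
    return n;
}
```

A real compiler would also reject values that do not fit in the target character type.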

You will need to implement this yourself. The reason is that what you are doing is determined by the string literal syntax of the language that you are compiling! (The fact that your compiler is implemented in C is immaterial.)

There are conventional escape sequences for string literals that span multiple languages; e.g. \n typically denotes the ASCII newline character. However, that doesn't mean that these conventions are appropriate for the language you are trying to compile.
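To make that concrete, here is a sketch of a decoder that takes the escape table as a parameter, so the table belongs to the language being compiled rather than to C. The names `struct escape_entry` and `unescape_string` are hypothetical.

```c
#include <stddef.h>

/* One entry of a language-specific escape table (hypothetical type). */
struct escape_entry { char name; char value; };

/* Copies src to dst, replacing each backslash escape defined in table.
   dst must have room for strlen(src) + 1 bytes. Returns the decoded
   length, or -1 on a trailing backslash or an escape not in the table. */
long unescape_string(const char *src, char *dst,
                     const struct escape_entry *table, size_t n)
{
    long len = 0;
    size_t i;

    while (*src != '\0') {
        char c = *src++;
        if (c == '\\') {
            if (*src == '\0')
                return -1;      /* trailing backslash */
            for (i = 0; i < n && table[i].name != *src; i++)
                ;
            if (i == n)
                return -1;      /* escape not defined by this language */
            c = table[i].value;
            src++;
        }
        dst[len++] = c;
    }
    dst[len] = '\0';
    return len;
}
```

A language that spells newline differently, or has no tab escape at all, only needs a different table passed in.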
