What is the default encoding for C strings?

Original post 2010-10-22 10:43:16 2 5 c/ string

I know that C strings are char[] with a '\0' in the last element. But how are the chars encoded?

Update: I found this cool link which talks about many other programming languages and their encoding conventions: Link

5 answers

All the standard says on the matter is that you get at least the 52 upper- and lower-case Latin alphabet characters, the digits 0 to 9, the symbols ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~ , the space character, and control characters representing horizontal tab, vertical tab, and form feed.

The only thing it says about numeric encoding is that all of the above fit in one byte, and that the value of each digit after zero is one greater than the value of the previous one.

The actual encoding is probably inherited from your locale settings. Probably something ASCII-compatible.

A C string is pretty much just a sequence of bytes. That means it does not have a well-defined encoding: it could be ASCII, UTF-8, or anything else, for that matter. Because most operating systems understand ASCII by default, and source code is mostly written in ASCII, the data you find in a simple (char*) will very often be ASCII as well. Nonetheless, there is no guarantee: what you get out of a (char*) may just as well be UTF-8 or even KOI8.

The standard does not specify this. Typically it is ASCII.

They are not really "encoded" as such, they are simply stored as-is. The string "hello" represents an array with the char values 'h' , 'e' , 'l' , 'l' , 'o' and '\0' , in that order. The C standard has a basic character set that includes these characters, but doesn't specify an encoding into bytes. It could be EBCDIC, for all you know.

As others have indicated already, C places some restrictions on what is permitted for the source and execution character encodings, but is relatively permissive. In particular, the encoding is not necessarily ASCII, although in most cases nowadays it is at least an extension of it.

Your execution environment is meant to perform any necessary translation between the source and execution character sets. So generally you should not care about the encoding and, on the contrary, should try to code independently of it. This is why there are special escape sequences for special characters like '\n' or '\t' and universal character names like '\u0386' . So usually you shouldn't have to look up the encodings for the execution character set yourself.

Adding and Subtracting Characters of Strings(Encoding / Decoding) in C

How to change a strings encoding as utf 8 in C

What is the default data structure in C?

What is the default type for C const?

What is the default value of Register in C

Encoding NanoPB variable Strings

Encoding strings and ints for a RPC

What's wrong with this strings comparison function in C? How are strings compared?

What is wrong with this code attempting to concatenate strings in C?

What is a safe way to join strings in C?

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact yoyou2525@163.com.

solution1
8 ACCEPTED 2010-10-22 10:55:40

solution2
7 2010-10-22 10:56:38

solution3
6 2010-10-22 10:47:37

solution4
1 2010-10-22 10:47:52

solution5
1 2010-10-22 11:39:09
