C++ Extended ASCII characters

How can I detect the presence of extended ASCII values (128 to 255) in a C++ character array?

Please remember that there is no such thing as extended ASCII. ASCII was and is only defined between 0 and 127. Everything above that is either invalid or needs to be in a defined encoding other than ASCII (for example ISO-8859-1).

Please read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)".

Other than that: what's wrong with iterating over it and checking for any value > 127 (or < 0 when using signed chars)?

char can be signed or unsigned. This doesn't really matter, though. You actually want to check if each character is valid ASCII. This is a positive, non-ambiguous check. You simply check if each char is both >= 0 and <= 127. Anything else (whether positive or negative, "Extended ASCII" or UTF-8) is invalid.
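A minimal sketch of that range check, assuming the array comes with an explicit length. The name contains_non_ascii is made up for illustration. The cast to unsigned char is one way to get the same result whether plain char is signed or unsigned on a given platform.

```cpp
#include <cstddef>

// Hypothetical helper: returns true if any of the first `len` chars
// lies outside the ASCII range 0..127.
bool contains_non_ascii(const char* data, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        // Viewing the char as unsigned turns "negative" values into 128..255,
        // so a single comparison covers both signed and unsigned char.
        const unsigned char byte = static_cast<unsigned char>(data[i]);
        if (byte > 127) {
            return true;
        }
    }
    return false;
}
```

Calling it as contains_non_ascii(buf, sizeof buf) on a char array would scan every element, including any embedded or trailing zero bytes, which are valid ASCII.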

Iterate over the array and check that each character doesn't fall in the 128 to 255 range?

Just check the highest bit of each char with a bitwise AND mask (endianness is not a concern here, since the test looks at one byte at a time):

if (ch & 128) {
  // high bit is set
} else {
  // looks like a 7-bit value
}

But there are probably locale functions you should be using for this. Better yet, KNOW what character encoding the data is coming in as. Trying to guess it is like trying to guess the format of data going into your database fields. It might go in, but garbage in, garbage out.
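For what it's worth, the high-bit test above could be applied to a whole buffer roughly like this. It is only a sketch: any_high_bit_set is a made-up name, and it assumes C++17 for std::string_view.

```cpp
#include <algorithm>
#include <string_view>

// Hypothetical helper: returns true if any char in `text` has bit 7 set,
// i.e. is not a 7-bit value.
bool any_high_bit_set(std::string_view text) {
    return std::any_of(text.begin(), text.end(), [](char ch) {
        // Mask off everything except the top bit of the byte.
        return (static_cast<unsigned char>(ch) & 0x80u) != 0;
    });
}
```

This only says whether some byte is outside the 7-bit range. It says nothing about which encoding those bytes belong to, which is the point made above about knowing the incoming encoding.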

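Doesn't anyone use isascii anymore?

#include <ctype.h>   // isascii (a POSIX extension, not standard C++)
#include <iostream>

int main()
{
    char c = (char) 200;

    if (isascii(c))
    {
        std::cout << "it's ascii!" << std::endl;
    }
    else
    {
        std::cout << "it's not ascii!" << std::endl;
    }
}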
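
Since isascii comes from POSIX rather than the C++ standard, it may not be available on every toolchain. A rough stand-in using only standard C++ might look like the following; is_ascii_char and first_non_ascii are names invented for this sketch, and std::string_view assumes C++17.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for isascii(): true for values 0..127 only.
constexpr bool is_ascii_char(char c) {
    return static_cast<unsigned char>(c) <= 127;
}

// Returns the index of the first non-ASCII char in `text`,
// or std::string_view::npos if every char is ASCII.
std::size_t first_non_ascii(std::string_view text) {
    const auto it = std::find_if_not(text.begin(), text.end(), is_ascii_char);
    return it == text.end() ? std::string_view::npos
                            : static_cast<std::size_t>(it - text.begin());
}
```

Returning the position rather than a plain yes/no is a design choice for the sketch; it makes it easier to report where the offending byte is.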

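Check that the values are not negative:

// Returns true if the null-terminated array contains a char with a negative
// value, i.e. a byte in the 128 to 255 range when viewed as unsigned.
bool detect(const signed char* x) {
  while (*x++ > 0);   // skip chars in 1..127; stop at the terminator or a negative char
  return x[-1] != 0;  // the char that stopped the loop: 0 means clean, negative means extended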
}

Try this code:

char c = (char) 200;

if (isascii(c))
{
    cout << "it's ascii!" << endl;
}
else
{
    cout << "it's not ascii!" << endl;
}
