
What is the macro definition of isupper in C?

I want to know how the "isupper" macro is defined in C/C++. Could you please provide the definition or point me to available resources? I tried looking at ctype.h but couldn't figure it out.

It's implementation defined -- every vendor can, and usually does, do it differently.

The most common approach usually involves a "traits" table: an array with one element for each character, where the value of each element is a set of flags describing that character. An example would be:

 traits[(int) 'C'] = ALPHA | UPPER | PRINTABLE;

In which case, isupper() would be something like:

 #define isupper(c) ((traits[(int)(c)] & UPPER) == UPPER)
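To make the traits-table idea concrete, here is a minimal sketch. The flag values, the `init_traits` helper and the name `my_isupper` are all made up for illustration, and the table is filled in for ASCII letters only; a real library would generate the table per locale.

```c
#include <limits.h>

/* Hypothetical flag bits; real libraries pick their own names and values. */
enum { ALPHA = 1 << 0, UPPER = 1 << 1, LOWER = 1 << 2, PRINTABLE = 1 << 3 };

/* One entry per possible unsigned char value, zero-initialized. */
static unsigned char traits[UCHAR_MAX + 1];

/* Fill in the entries for the letters only.
   Assumes ASCII, where 'A'..'Z' and 'a'..'z' are contiguous. */
static void init_traits(void)
{
    for (int c = 'A'; c <= 'Z'; c++)
        traits[c] = ALPHA | UPPER | PRINTABLE;
    for (int c = 'a'; c <= 'z'; c++)
        traits[c] = ALPHA | LOWER | PRINTABLE;
}

/* Named my_isupper so it does not clash with the real isupper.
   The cast to unsigned char keeps the index non-negative. */
#define my_isupper(c) ((traits[(unsigned char)(c)] & UPPER) != 0)
```

After calling `init_traits()`, `my_isupper('C')` is true while `my_isupper('c')` and `my_isupper('1')` are false. The lookup is a single array read and a mask, which is why this layout is popular.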

The standard specifies isupper() as a function, although an implementation is allowed to provide a macro of the same name as well. Its result depends on things like the locale and the current character set, which is why there's a function specifically for this purpose.
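If you want to be sure you reach the function rather than any macro your headers may define, standard C gives two ways to do it. The helper names below are hypothetical; only `isupper` itself comes from `<ctype.h>`.

```c
#include <ctype.h>

/* Parenthesizing the name suppresses a function-like macro,
   so this always calls the library function. */
static int call_function_version(int c)
{
    return (isupper)(c) != 0;
}

/* A function-like macro is only expanded when followed by '(',
   so taking the address also refers to the real function. */
static int call_through_pointer(int c)
{
    int (*fn)(int) = isupper;
    return fn(c) != 0;
}
```

Both helpers normalize the result to 0 or 1, because isupper() is only required to return some nonzero value for an upper case letter.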

For ASCII, because of the way the letters are assigned, it's actually quite easy to test for this. If the ASCII code of the character falls in between 0x41 and 0x5A inclusive, then it is an upper case letter.
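As a sketch, the ASCII range test described above looks like this. The name `ascii_isupper` is made up, and the function ignores locales entirely.

```c
/* Assumes the character codes are ASCII.
   0x41 is 'A' and 0x5A is 'Z'. */
static int ascii_isupper(int c)
{
    return c >= 0x41 && c <= 0x5A;
}
```

The codes just outside the range, 0x40 ('@') and 0x5B ('['), are rejected.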

It's implementation-specific. One obvious way to implement it would be:

extern char *__isupper;
#define isupper(x) ((int)__isupper[(x)])

Where __isupper points to an array of 0's and 1's determined by the locale. However this sort of technique has gone out of favor since accessing global variables in shared libraries is rather inefficient and creates permanent ABI requirements, and since it's incompatible with POSIX thread-local locales.

Another obvious way to implement it on ASCII-only or UTF-8-only implementations is:

#define isupper(x) ((unsigned)(x)-'A'<='Z'-'A')
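The single comparison works because the subtraction is done in unsigned arithmetic: any value below 'A' wraps around to a very large number, so one `<=` covers both bounds and the argument is evaluated only once. Here is the same expression as a function, with the hypothetical name `range_isupper`:

```c
/* Same test as the macro above, written as a function.
   Values below 'A' (including negative ones such as EOF) wrap
   around to large unsigned numbers and fail the comparison. */
static int range_isupper(int x)
{
    return (unsigned)x - 'A' <= 'Z' - 'A';
}
```

Like the macro, this assumes an encoding in which 'A' through 'Z' are contiguous, as they are in ASCII and UTF-8.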

It's actually fairly complicated, in glibc (the C library commonly used with GCC) for instance. But a simple ASCII-only isupper, which has a double-evaluation bug because it uses its argument twice, could be defined as:

#define isupper(c) (((c) >= 'A') && ((c) <= 'Z'))

http://ideone.com/GlN05
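The double-evaluation problem is easy to observe with an argument that has a side effect. In this sketch `naive_isupper`, `safe_isupper`, `next_char` and the `reads` counter are all hypothetical names used only for the demonstration.

```c
/* Macro version: the argument expression appears twice. */
#define naive_isupper(c) (((c) >= 'A') && ((c) <= 'Z'))

/* Function version: the argument is evaluated exactly once. */
static int safe_isupper(int c)
{
    return c >= 'A' && c <= 'Z';
}

/* Counts how many times the argument expression is evaluated. */
static int reads = 0;

static int next_char(void)
{
    reads++;
    return 'Q';
}
```

Calling `naive_isupper(next_char())` bumps `reads` by two, while `safe_isupper(next_char())` bumps it by one. With something like `getchar()` as the argument, the macro would consume two characters from the input.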

glibc specifically tests one flag bit in the current locale's table entry for the character (shown here as bit 0; the exact mask depends on the platform's byte order):

(*__ctype_b_loc ())[(int) (c)] & (unsigned short int) (1 << (0))

Where __ctype_b_loc() is a function that returns a pointer to a pointer into a table for the current locale. The table holds one unsigned short of flag bits for each character in the current character set.
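For illustration only, the same lookup can be written by hand. This is a sketch: `table_isupper` is a made-up name, `__ctype_b_loc` and `_ISupper` are glibc internals that portable code should not rely on, and the block falls back to the standard isupper() on other C libraries.

```c
#include <ctype.h>

static int table_isupper(int c)
{
#if defined(__GLIBC__)
    /* glibc only: read the current locale's flag word for c and
       test the "upper" bit. c must be EOF or fit in unsigned char. */
    return ((*__ctype_b_loc())[c] & (unsigned short)_ISupper) != 0;
#else
    /* Elsewhere, use the standard interface. */
    return isupper(c) != 0;
#endif
}
```

Using the `_ISupper` constant from the header, rather than a hard-coded shift, avoids having to know which bit the platform actually uses.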
