
How do you convert char numbers to decimal and back or convert ASCII 'A'-'Z'/'a'-'z' to letter offsets 0 for 'A'/'a' ...?

If you have a char in the range '0' to '9', how do you convert it to an int value of 0 to 9?

And then how do you convert it back?

Also, given letters 'A' to 'Z' or 'a' to 'z', how do you convert them to the range 0-25 and then back?

It is okay to optimize for ASCII.

The basic char encoding specified by C++ makes converting to and from '0' - '9' easy.

C++ specifies:

In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous.

This means that, whatever the integral value of '0', the integral value of '1' is '0' + 1 , the integral value of '2' is '0' + 2 , and so on. Using this information and the basic rules of arithmetic you can convert from char to int and back easily:

char c = ...; // some value in the range '0' - '9'
int int_value = c - '0';

// int_value is in the range 0 - 9
char c2 = '0' + int_value;

Portably converting the letters 'a' to 'z' to numbers from 0 to 25 is not as easy, because C++ does not specify that the values of these letters are consecutive. In ASCII they are consecutive, and you can write code that relies on that, similar to the above code for '0' - '9'. (These days ASCII is used almost everywhere.)
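As a minimal sketch of the ASCII-reliant approach just described (the helper names lower_to_offset and offset_to_lower are made up for illustration, and the static_assert is only a cheap sanity check, not a proof that the letters are contiguous):

```cpp
#include <cassert>

// Assumption: the execution character set is ASCII-compatible, so the
// lowercase letters occupy 26 consecutive values. On an encoding such as
// EBCDIC this check fails and the build stops.
static_assert('z' - 'a' == 25, "lowercase letters do not look contiguous");

// Hypothetical helper: 'a' -> 0, 'b' -> 1, ..., 'z' -> 25.
int lower_to_offset(char c) {
  assert('a' <= c && c <= 'z');
  return c - 'a';
}

// Hypothetical helper: 0 -> 'a', 1 -> 'b', ..., 25 -> 'z'.
char offset_to_lower(int i) {
  assert(0 <= i && i <= 25);
  return static_cast<char>('a' + i);
}
```

The same pattern with 'A' in place of 'a' would handle uppercase letters under the same assumption.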

Portable code would instead use a lookup table or specific checks for each character:

char int_to_char[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};

int char_to_int[CHAR_MAX + 1] = {};

for (int i=0; i<sizeof(int_to_char); ++i) {
  char_to_int[int_to_char[i]] = i;
}

// convert a lowercase char letter to a number in the range 0 - 25:
int i = char_to_int['d'];

// convert an int in the range 0 - 25 to a char
char c = int_to_char[25];

In C99 you can just directly initialize the char_to_int[] data without a loop.

int char_to_int[] = {['a'] = 0, ['b'] = 1, ['c'] = 2, ['d'] = 3, ['e'] = 4, ['f'] = 5, ['g'] = 6, ['h'] = 7, ['i'] = 8, ['j'] = 9, ['k'] = 10, ['l'] = 11, ['m'] = 12, ['n'] = 13, ['o'] = 14, ['p'] = 15, ['q'] = 16, ['r'] = 17, ['s'] = 18, ['t'] = 19, ['u'] = 20, ['v'] = 21, ['w'] = 22, ['x'] = 23, ['y'] = 24, ['z'] = 25};

C++ compilers that also support C99 may support this in C++ as well, as an extension.
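If the extension is not available, one possible alternative in standard C++ (C++14 or later) is to build the reverse table at compile time in a constexpr function. This is only a sketch: the names CharTable, make_char_table, kCharTable and alpha_to_int are invented here, and -1 is used as the "not a lowercase letter" marker instead of 0.

```cpp
#include <cassert>
#include <climits>

// Wrapper struct so the table can be returned from a constexpr function.
struct CharTable {
  int value[UCHAR_MAX + 1];
};

constexpr char kLetters[] = "abcdefghijklmnopqrstuvwxyz";

// Fill every slot with -1, then record the index of each lowercase letter.
// No assumption is made about the letters being consecutive.
constexpr CharTable make_char_table() {
  CharTable t{};
  for (int i = 0; i <= UCHAR_MAX; ++i) t.value[i] = -1;
  for (int i = 0; i < 26; ++i) {
    t.value[static_cast<unsigned char>(kLetters[i])] = i;
  }
  return t;
}

constexpr CharTable kCharTable = make_char_table();

// Returns 0-25 for 'a'-'z', or -1 for any other character.
int alpha_to_int(char c) {
  // Index through unsigned char so a negative char cannot index out of bounds.
  return kCharTable.value[static_cast<unsigned char>(c)];
}
```

Using -1 as the marker avoids the ambiguity between "not mapped" and the legitimate result 0 for 'a'.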


Here's a complete program that generates random values to use in these conversions. It uses C++, plus the C99 designated initialization extension.

#include <cassert>

int digit_char_to_int(char c) {
  assert('0' <= c && c <= '9');
  return c - '0';
}

char int_to_digit_char(int i) {
  assert(0 <= i && i <= 9);
  return '0' + i;
}

int alpha_char_to_int(char c) {
  static constexpr int char_to_int[] = {['a'] = 0, ['b'] = 1, ['c'] = 2, ['d'] = 3, ['e'] = 4, ['f'] = 5, ['g'] = 6, ['h'] = 7, ['i'] = 8, ['j'] = 9, ['k'] = 10, ['l'] = 11, ['m'] = 12, ['n'] = 13, ['o'] = 14, ['p'] = 15, ['q'] = 16, ['r'] = 17, ['s'] = 18, ['t'] = 19, ['u'] = 20, ['v'] = 21, ['w'] = 22, ['x'] = 23, ['y'] = 24, ['z'] = 25};

  // c must be a valid index, so it has to be strictly less than the element count
  assert(0 <= c && c < sizeof(char_to_int)/sizeof(*char_to_int));
  int i = char_to_int[c];
  assert(i != 0 || c == 'a');
  return i;
}

char int_to_alpha_char(int i) {
  static constexpr char int_to_char[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};

  assert(0 <= i && i <= 25);
  return int_to_char[i];
}

#include <random>
#include <iostream>

int main() {
  std::random_device r;
  std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
  std::mt19937 m(seed);

  std::uniform_int_distribution<int> digits{0, 9};
  std::uniform_int_distribution<int> letters{0, 25};

  for (int i=0; i<20; ++i) {
    int a = digits(m);
    char b = int_to_digit_char(a);
    int c = digit_char_to_int(b);

    std::cout << a << " -> '" << b << "' -> " << c << '\n';
  }

  for (int i=0; i<20; ++i) {
    int a = letters(m);
    char b = int_to_alpha_char(a);
    int c = alpha_char_to_int(b);

    std::cout << a << " -> '" << b << "' -> " << c << '\n';
  }

}

There are two main ways to do this conversion: by lookup and mathematically.

All ASCII values in this answer are given in decimal notation.

Note that in ASCII, '0' is 48, 'A' is 65, and 'a' is 97.

Lookup:

In the lookup version you place the mapped characters in an array of char, and build an array of ints to convert back.

In order to both validate and get the corresponding value when mapping char to int:

0 will be a sentinel value meaning "not mapped: out of range"
all stored results will be one more than expected

unsigned char is used to make sure a char with a negative signed value is handled correctly.

While C (since C99) allows the notation { ['A'] = 1, ['B'] = 2,… }; , standard C++ does not, so the following code can be used generically to fill the lookup tables:

void fill_lookups(unsigned char * from_table, int from_size, int * to_table)
{
     for (int i = 0; i < from_size; ++i)
     {
         to_table[from_table[i]]=i+1; // add one to support 0 as "out of range"
     }
}

unsigned char int_to_char[]={ '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
unsigned char int_to_lower[]={'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
                     'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
                     'u', 'v', 'w', 'x', 'y', 'z'};
unsigned char int_to_upper[]={'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
                     'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
                     'U', 'V', 'W', 'X', 'Y', 'Z'};

int char_to_int[UCHAR_MAX+2] = {};       // This will return 0 for non digits
int letter_to_offset[UCHAR_MAX+2] = {};  // This will return 0 for non alpha

fill_lookups(int_to_char, sizeof(int_to_char), char_to_int);
fill_lookups(int_to_lower, sizeof(int_to_lower), letter_to_offset);
fill_lookups(int_to_upper, sizeof(int_to_upper), letter_to_offset);

// Helper function: reports whether c is mapped, and subtracts the +1 bias from mapped values
int to_int(int * table, unsigned char c, bool * in_range)
{
   int ret = table[c];
   if (ret)
   {
       *in_range=(1==1); // for C/C++ true
       --ret;
   }
   else
   {
       *in_range=(0==1); // for C/C++ false
   }

   return ret;
}

bool in_range;  // always true in these cases
int a=to_int(char_to_int, '7', &in_range); // a is now 7
char b=int_to_char[7]; // b is now '7'    
int c=to_int(letter_to_offset, 'C', &in_range); // c=2
int d=to_int(letter_to_offset, 'c', &in_range); // d=2
char e=int_to_upper[2]; // e='C'
char f=int_to_lower[2]; // f='c'

This works, and it may make sense if validation or other lookups are needed, but...

In general a better way to do this is to use arithmetic.

Mathematically (the letter conversions work for ASCII)

Assuming that the conversions have already been validated to be in the correct range (C-style casts are used so the code works in both C and C++):

Note that '0'-'9' are guaranteed to be consecutive in C and C++.

For ASCII, 'A'-'Z' and 'a'-'z' are not only consecutive, but 'A' % 32 and 'a' % 32 are both 1.

int a='7'-'0';         // a is now 7 in ASCII: 55-48=7

char b=(char)7+'0';    // b is now '7' in ASCII: 7 + 48

int c='C' % 32 - 1;    // c is now 2 in ASCII : 67 % 32 = 3, then 3 - 1 = 2

-or- where we know it is uppercase

int c='C'-'A';         // c is now 2 in ASCII : 67 - 65 = 2


int d='c' % 32 - 1;    // d is now 2 in ASCII : 99 % 32 = 3, then 3 - 1 = 2

-or- where we know it is lowercase

int d='c'-'a';         // d is now 2 in ASCII : 99 - 97 = 2

char e=(char)2 + 'A';  // e is 'C' in ASCII : 65 + 2 = 67
char f=(char)2 + 'a';  // f is 'c' in ASCII : 97 + 2 = 99
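The % 32 observation above could be wrapped into a single case-insensitive helper. This is a sketch that assumes ASCII; the function name letter_offset_any_case is made up for illustration.

```cpp
#include <cassert>

// Assumes ASCII: 'A' is 65 and 'a' is 97, and both leave remainder 1 when
// divided by 32, so upper and lower case map to the same offset.
int letter_offset_any_case(char c) {
  // Validate first; the arithmetic below is meaningless for non-letters.
  assert(('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z'));
  return c % 32 - 1;
}
```

For example, 'Z' is 90 and 'z' is 122; 90 % 32 and 122 % 32 are both 26, so both map to 25.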

If you know a character c is either a letter or a digit, you can just do:

int cton( char c )
{
  if( 'a' <= c ) return c-'a';
  if( 'A' <= c ) return c-'A';
  return c-'0';
}

Add whatever error checking on c is needed.

To convert an integer n back to a char , just do '0'+n if you want a digit, 'A'+n if you want an uppercase letter, and 'a'+n if you want lowercase.

Note: This works for ASCII (as the question is tagged). See Pete's informative comment, however.
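One way the error checking and the reverse conversion might look, still assuming ASCII. The names cton_checked, ntoc and Kind are invented for this sketch, and returning -1 for invalid input is just one possible convention.

```cpp
#include <cassert>

// Like cton, but every range is bounded on both sides.
// Returns -1 for anything that is not an ASCII digit or letter.
int cton_checked(char c) {
  if ('a' <= c && c <= 'z') return c - 'a';
  if ('A' <= c && c <= 'Z') return c - 'A';
  if ('0' <= c && c <= '9') return c - '0';
  return -1;
}

// Hypothetical tag saying which kind of character the caller wants back.
enum class Kind { Digit, Upper, Lower };

// Converts n back to a char of the requested kind.
char ntoc(int n, Kind kind) {
  switch (kind) {
    case Kind::Digit:
      assert(0 <= n && n <= 9);
      return static_cast<char>('0' + n);
    case Kind::Upper:
      assert(0 <= n && n <= 25);
      return static_cast<char>('A' + n);
    case Kind::Lower:
      assert(0 <= n && n <= 25);
      return static_cast<char>('a' + n);
  }
  return '\0';  // not reached for valid Kind values
}
```

The caller has to say which kind it wants because the number alone does not record whether it came from a digit, an uppercase letter, or a lowercase letter.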

If I understand correctly, you want to do this:

#include <ctype.h>    /* for toupper */

int digit_from_char(char c) {
    return c - '0';
}

char char_from_digit(int d) {
    return d + '0';
}

int letter_from_char(char c) {
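    /* cast to unsigned char: passing a negative char to toupper is undefined behavior */
    return toupper((unsigned char)c) - 'A';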
}

char char_from_letter(int l) {
    return l + 'A';
}
