Trying to shift each bit in a string

I'm trying to write an encoding program that shifts the ASCII code of each character in a string and prints the new character, so that I can later reverse the shift and decode a message.

example

"#" = 35 or 100011

100011 shifted left once = 1000110 or 70

Then I want to print "F".

This is the code I have so far. I don't understand the output. I'm not sure if it's because there is no ASCII character with a code beyond 127.

#include <iostream>
#include <string>

using namespace std;

int main ()
{
    int i;

    string str ("Hello World");
    string encode, decode;


    for ( i=0; i<str.length(); ++i)
    {
        cout << str[i];
    }

    cout << endl << endl;

    for ( i=0; i<str.length(); ++i)
    {
        cout << (int) str[i] << " ";

    }

    cout << endl << endl;

    for ( i=0; i<str.length(); ++i)
    {
        encode[i] = (str[i] << 1) ;

        cout << encode[i]  << " ";
    }

    cout << endl << endl;

    return 0;
}

output:

Hello World

72 101 108 108 111 32 87 111 114 108 100 

\220 \312 \330 \330 \336 @ \256 \336 \344 \330 \310 

Unfortunately, OP didn't mention the OS or the terminal the program was run in, but I believe I know what happened, so I'll venture an answer.

I'll describe it for the first letter, H. (The same happens for all the others.)

for ( i=0; i<str.length(); ++i)
{
    cout << str[i];
}

That's simple: std::ostream& operator <<(std::ostream&, char) is used and just prints H .

for ( i=0; i<str.length(); ++i)
{
    cout << (int) str[i] << " ";

}

The characters (type char) are converted to int. (The cast is applied first because its precedence is higher than that of operator<<().) Hence, std::ostream& operator <<(std::ostream&, int) is used. As there are no I/O manipulators active, it just prints 72, the decimal value of the ASCII code of H. (In C++, 'H' (a char constant) and 72 (an int constant) are simply two ways to express the value 72; they differ only in type.)

for ( i=0; i<str.length(); ++i)
{
    encode[i] = (str[i] << 1) ;

    cout << encode[i]  << " ";
}

This is what happens in third loop:

  • str[i] provides a char .
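  • Both operands of operator<<() undergo integral promotion, so the char is promoted to int before the shift.
  • The operator<<() (in its original meaning, "bitwise left shift") effectively multiplies the value of str[i] by 2, i.e. H (== 72) becomes 144.
  • The result is converted back to char when assigned to encode[i]. The value is narrowed to 8 bits, not clamped: 144 keeps its bit pattern 0x90, which reads as -112 on typical platforms where char is a signed 8-bit type.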
  • The value of encode[i] is printed using std::ostream& operator <<(std::ostream&, char) (as in first loop).

Now things get murky, as I don't know where (and how) the output is displayed. (Hence my initial complaint about the missing OS and terminal.)

However, I saw similar output when working in an xterm without UTF-8 support.

144 might be an unprintable character in the output console. (Standard ASCII describes only the characters with values 0 ... 127, and the first 32 as well as the last one are control characters.) In this case, the code of the character is just printed as an octal escape sequence (in the same form as the one accepted in C/C++ string literals).

Windows calculator: Dec 144 Oct outputs 220 .

Yep. It matches the \220 shown by OP.


After thinking twice, I remembered that a byte with a value >= 128 never stands alone in UTF-8. Code points above 127 are always encoded with at least two bytes, each with a value >= 128. Hence, this output may well appear in a terminal with UTF-8 support too, since the output simply doesn't form valid UTF-8 sequences.


Out of curiosity, I compiled and tested OP's program on coliru and got:

Hello World

72 101 108 108 111 32 87 111 114 108 100 

� � � � � @ � � � � � 

Live Demo on coliru

The � are probably placeholders shown for bytes that do not form valid UTF-8 sequences. To check this, I made a counterexample:

#include <iostream>

int main()
{
  std::cout << "\xc3\x9c\n";
  return 0;
}

where "\xc3\x9c" provides the UTF-8 encoded sequence for Ü.

Output:

Ü

Live Demo on coliru

So let's list out what you are trying to do:

  1. Get a string as input (i.e. an array of characters).
  2. Convert every character to an integer, apply a left shift, and store the result in another string, encode, which is again an array of characters.

So, now about the issue:

  1. You are bit shifting after the conversion to int, which is fine, but after the shift you are trying to store the result in an array of characters, where each character is at most 1 byte and can only hold values from -128 to 127 (when char is signed).

So, that's why it can never store the correct value once the result exceeds that limit.

You can still perform the shift on an explicit int like this:

encode[i] = ((int) str[i]) << 1 ;

But the issue remains: once a value exceeds the limit, it wraps around into the negative range that starts at -128, leaving you with a list of negative numbers as a result.
