Bit manipulation on character string

Can we apply bit manipulation on a character string? If so, is it always possible to retrieve back a character string from the manipulated string?

I was hoping to use the XOR operator on two strings by converting them to binary and then back to a character string.

I took some code from another StackOverflow question, but it only solves half the problem:

#include <bitset>
#include <string>

std::string TextToBinaryString(const std::string& words)
{
    std::string binaryString;
    for (char _char : words)
    {
        // append 8 binary digits per character
        binaryString += std::bitset<8>(_char).to_string();
    }
    return binaryString;
}

I don't know how to convert this string of ones and zeroes back to a string of characters. I did see std::stoi suggested as a solution in some Google search results, but I was not able to understand it.

The manipulation that I wish to do is

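std::string message("Hello World");
std::string bin_string = TextToBinaryString(message);
std::size_t n = bin_string.size();

// split the bit string into two halves
std::string left = bin_string.substr(0, n / 2);
std::string right = bin_string.substr(n / 2);

std::string result = left ^ right;  // the operation I want; this does not compile as written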

I know I can hardcode this by picking up every entry and applying the operation, but it is the conversion of the binary string back to characters that is making me scratch my head.

EDIT: I am trying to implement a cipher framework called a Feistel cipher (sorry, I should have made that clear before). It uses the property of XOR that XORing a value with the same thing twice cancels out, e.g. (A^B)^B=A. I wanted to output the ciphered gibberish in the middle, hence the question.

Can we apply bit manipulation on a character string?

Yes.

A character is an integer type, so you can do anything to it that you can do to any other integer. What happened when you tried?

If so, is it always possible to retrieve back a character string from the manipulated string?

No. It is sometimes possible to recover the original string, but some manipulations are not reversible.

XOR, the particular operation you asked about, is self-reversing, so it works in that case but not in general.

A cheesy example (it depends on the ASCII character set; don't do this in real code for converting case, etc.):

#include <iostream>
#include <string>

int main() {
    std::string s("a");
    std::cout << "original: " << s << '\n';
    s[0] ^= 0x20;
    std::cout << "modified: " << s << '\n';
    s[0] ^= 0x20;
    std::cout << "restored: " << s << '\n';
}

prints (on an ASCII-compatible system)

original: a
modified: A
restored: a

Note that I'm not converting "a" into "1100001" first, then (somehow) using XOR to zero bit 5, giving "1000001", and then converting that back into "A". Why would I?

This part of your question suggests you don't understand the difference between values and representations: the character is always stored in binary. You can also always treat it as if it is stored in octal, or in decimal, or in hexadecimal - the choice of base only affects how we write (or print) the value, and not what the value is in itself.
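If you do still want the string of ones and zeroes, say for display, the round trip might look like the sketch below. BinaryStringToText is a name made up here to pair with the TextToBinaryString from the question. It assumes 8-bit chars and an input whose length is a multiple of 8.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Same idea as the function in the question: 8 binary digits per character.
std::string TextToBinaryString(const std::string& words)
{
    std::string bits;
    for (unsigned char c : words) {
        bits += std::bitset<8>(c).to_string();
    }
    return bits;
}

// Hypothetical inverse: read 8 digits at a time and turn each group back into a char.
std::string BinaryStringToText(const std::string& bits)
{
    assert(bits.size() % 8 == 0);  // assumption: whole bytes only
    std::string text;
    for (std::size_t i = 0; i < bits.size(); i += 8) {
        std::bitset<8> byte(bits, i, 8);  // parses the characters '0' and '1'
        text += static_cast<char>(byte.to_ulong());
    }
    return text;
}
```

With this, TextToBinaryString("a") gives "01100001", and feeding that back into BinaryStringToText gives "a" again. The digits are only a printed form of the value; the XOR itself is still best done on the characters directly.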


Writing the XOR step of a Feistel cipher, where the text and key are the same length, is trivial:

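#include <algorithm>
#include <iterator>
#include <string>

// Precondition: key must be at least as long as text.
std::string feistel(std::string const &text, std::string const &key)
{
    std::string result;
    // XOR each character of text with the character of key at the same position
    std::transform(text.begin(), text.end(), key.begin(),
                   std::back_inserter(result),
                   [](char a, char b) -> char { return a ^ b; }
                   );
    return result;
}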

This doesn't work at all if the key is shorter, though - looping round the key appropriately is left as an exercise for the reader.
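For what it's worth, one way to loop round the key is to index it modulo its length. This is only a sketch: xor_with_key is a made-up name, and treating an empty key as a caller error is an assumption made here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// XOR every character of text with the key, wrapping round to the start
// of the key whenever it runs out.
std::string xor_with_key(const std::string& text, const std::string& key)
{
    assert(!key.empty());  // assumption: an empty key is a caller error
    std::string result(text);
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = static_cast<char>(result[i] ^ key[i % key.size()]);
    }
    return result;
}
```

Calling it twice with the same key gives back the original text, because each character is XORed with the same key character both times.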

Oh, and printing the encoded string is unlikely to work nicely (unless your key is helpfully just a sequence of space characters, as above).
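One way round the printing problem is to show the bytes as hexadecimal instead of as characters. A minimal sketch, where to_hex is a made-up helper name rather than anything standard:

```cpp
#include <cassert>  // only needed for the checks mentioned below
#include <string>

// Render each byte of data as two lowercase hex digits.
std::string to_hex(const std::string& data)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : data) {
        out += digits[c >> 4];    // high four bits
        out += digits[c & 0x0F];  // low four bits
    }
    return out;
}
```

For example, to_hex("A") gives "41" on an ASCII system. Going through unsigned char matters here: a plain char may be negative, and shifting or indexing with a negative value would not do what you want.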

You probably want something like this:

#include <string>
#include <cassert>

std::string someBitmanipulation(std::string words)
{
  // words is passed by value, so we modify and return our own copy
  for (char& thechar : words)
  {
    thechar ^= 0x5A;  // xor with 0x5A
  }
  return words;
}

int main()
{
  std::string original{ "ABC" };
  // xor each char of original with 0x5A and put the result into manipulated
  auto manipulated = someBitmanipulation(original);

  // check that manipulating the manipulated string gives back the original string
  assert(original == someBitmanipulation(manipulated));
}

You don't need std::bitset at all.

Now change thechar ^= 0x5A; to say thechar |= 0x5A; and see what happens.
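To spell out what that experiment shows, here is a side-by-side sketch. The names xorWith5A and orWith5A are made up for illustration, and the character values in the comments assume ASCII.

```cpp
#include <cassert>  // only needed for the checks mentioned below
#include <string>

// XOR flips the bits set in 0x5A; applying it a second time flips them back.
std::string xorWith5A(std::string words)
{
    for (char& thechar : words) {
        thechar ^= 0x5A;
    }
    return words;
}

// OR forces the bits set in 0x5A to 1; whatever they were before is lost.
std::string orWith5A(std::string words)
{
    for (char& thechar : words) {
        thechar |= 0x5A;
    }
    return words;
}
```

With ASCII, 'A' (0x41) and 'C' (0x43) differ only in a bit that 0x5A forces to 1, so orWith5A("A") and orWith5A("C") both come out as "[" and there is no way to tell afterwards which one you started with. xorWith5A(xorWith5A("ABC")) is "ABC" again, whereas orWith5A(orWith5A("ABC")) stays at "[Z[", so the assert in main would fail.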
