I have a payload string that I want to convert into a character array, removing any non-ASCII characters from it first. Here is my code:
bool invalidChar(char c)
{
    return !(c >= 0 && c < 256);
}
void stripUnicode(string &str)
{
    str.erase(remove_if(str.begin(), str.end(), invalidChar), str.end());
}
Payload_input is a string consisting of ASCII and non-ASCII characters:
stripUnicode(Payload_input) ;
char input[Payload_input.length()];
strcpy(input,Payload_input.c_str());
char chunk1[Payload_input.length()];
int counter1=0;
for (counter1 = 0; counter1 < size; counter1++)
{
    chunk1[counter1] = input[counter1];
}
Here is the payload string that I want to convert into a char array:
--90B452BFFF3F395ABDC878D8BEDBD152
Content-Disposition: form-data; name="uploaddir"
language/2BB5B9330E/C/
--90B452BFFF3F395ABDC878D8BEDBD152
Content-Disposition: form-data; name="filename"; filename="lottery[1]20110727082525.jpg"
Content-Type: text/plain
Content-Transfer-Encoding: binary
JFIFddDucky<http://ns.adobe.com/xap/1.0/<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
In the above string, the first few characters after Content-Transfer-Encoding: binary appear as blocks on the Linux terminal (inside the blocks is written 0001, etc.).
When I print the characters (cout << chunk1[counter1]) after stripping the non-ASCII characters from the string, even some ASCII characters are omitted after the line Content-Transfer-Encoding: binary.
Could you point out what is wrong with my code?
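The problem is that on typical Linux platforms (x86, for example) plain char is signed, so its values lie in the range -128 to 127. Every byte above 127 is therefore seen as a negative value, and your invalidChar function returns true for all the chars that are not strictly 7-bit ASCII. The c < 256 half of the test never has any effect, because no char value can reach 256. If you want to check for extended ASCII (0-255), then your function is useless: every char value is in the extended ASCII set. However, since char is signed here, you need to check for negative values to tell the two halves of that range apart.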
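One way to make the check independent of whether char is signed is to cast to unsigned char before comparing. The following is only a minimal sketch of that idea, assuming the goal is to keep strictly 7-bit ASCII (0-127); the names isNonAscii and stripNonAscii are made up for illustration and are not from the question.

```cpp
#include <algorithm>
#include <string>

// Hypothetical replacement for invalidChar: returns true for any byte
// outside the 7-bit ASCII range (0..127). Casting to unsigned char first
// makes the test behave the same whether plain char is signed or unsigned.
bool isNonAscii(char c)
{
    return static_cast<unsigned char>(c) > 127;
}

// Hypothetical replacement for stripUnicode: removes every non-ASCII byte
// from str in place, using the same erase/remove_if idiom as the question.
void stripNonAscii(std::string &str)
{
    str.erase(std::remove_if(str.begin(), str.end(), isNonAscii), str.end());
}
```

Note that this keeps ASCII control characters, including NUL bytes, which binary data such as a JPEG body is likely to contain. strcpy stops copying at the first NUL byte, so that may be why part of the payload appears to be missing afterwards; copying Payload_input.length() bytes explicitly would avoid that.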