How to properly convert std::string to an integer vector
My high-level goal is to convert any string (which may include non-ASCII characters) into a vector of integers by converting each character to an integer.
I already have a Python code snippet for this purpose:
bytes = list(text.encode())
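Now I want a C++ equivalent. I tried something like:
#include <cstring>
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::string inputText = "testΔ";
    char const* bytes = inputText.c_str();
    long bytesLen = std::strlen(bytes);
    // Each char is widened to long; where char is signed, bytes >= 0x80 come out negative
    auto vec = std::vector<long>(bytes, bytes + bytesLen);
    for (auto number : vec) {
        std::cout << number << std::endl;
    }
    return 0;
}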
For an input string like "testΔ", the Python code outputs [116, 101, 115, 116, 206, 148].
However, the C++ code outputs [116, 101, 115, 116, -50, -108].
How should I change the C++ code to make them consistent?
However, the C++ code outputs [116, 101, 115, 116, -50, -108].
In C++, the char type is distinct from both signed char and unsigned char, and whether it is signed is implementation-defined. You thus explicitly want an unsigned char const*, but the .c_str method gives you a char const*, so you need to cast. You will need reinterpret_cast or a C-style cast; static_cast will not work.
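As a minimal sketch of the cast described above: the helper name byte_values is hypothetical (it does not appear in the question), and using size() instead of strlen is an assumption made here so that embedded '\0' bytes are not dropped.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns each byte of `text` as a value in 0..255.
std::vector<long> byte_values(const std::string& text) {
    // c_str() yields char const*; view the same bytes as unsigned char.
    auto const* bytes = reinterpret_cast<unsigned char const*>(text.c_str());
    // The range constructor widens each unsigned char to long.
    return std::vector<long>(bytes, bytes + text.size());
}
```

Called with the UTF-8 encoding of "testΔ", this should give the same values as the Python snippet, because every byte is read as unsigned before it is widened.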
You can iterate over the contents of a std::string just fine; there is no need to convert it to a std::vector. Try this:
#include <iostream>
#include <string>

int main()
{
    std::string str = "abc";
    for (auto c : str)
    {
        // Convert to unsigned char first so that bytes >= 0x80 print as 128..255
        std::cout << static_cast<unsigned int>(static_cast<unsigned char>(c)) << std::endl;
    }
}
The cast is needed here only because the standard operator<< prints a char as a character, not as a number. Otherwise, you can work with a char just like any other integral type. The value goes through unsigned char before being widened to unsigned int so that the output is never negative: the signedness of char is implementation-defined, and converting a negative char directly to unsigned int would wrap around to a very large value instead of landing in 128..255.
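If a vector is still wanted, the same per-character conversion can fill one without any pointer cast. This is only a sketch; the helper name to_byte_vector is hypothetical and not part of the original answer.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: converts each char to its unsigned byte value.
std::vector<int> to_byte_vector(const std::string& text) {
    std::vector<int> result;
    result.reserve(text.size());
    for (char c : text) {
        // Going through unsigned char maps negative chars to 128..255.
        result.push_back(static_cast<unsigned char>(c));
    }
    return result;
}
```

The design choice here is to convert value by value with static_cast, which avoids reinterpret_cast at the cost of an explicit loop.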
How should I change the C++ code to make them consistent?
The difference appears to be that Python uses unsigned char values, while char is signed in your C++ implementation. One solution: reinterpret the string as an array of unsigned char.