
How to use new std::byte type in places where old-style unsigned char is needed?

std::byte is a new type in C++17, defined as enum class byte : unsigned char. This makes it impossible to use without an appropriate conversion. So I have made an alias for a vector of this type to represent a byte array:

using Bytes = std::vector<std::byte>;

However, it is impossible to use it in the old style: functions that accept it as a parameter fail because this type cannot be easily converted to the old std::vector<unsigned char> type. For example, when using the zipper library:

/resourcecache/pakfile.cpp: In member function 'utils::Bytes resourcecache::PakFile::readFile(const string&)':
/resourcecache/pakfile.cpp:48:52: error: no matching function for call to 'zipper::Unzipper::extractEntryToMemory(const string&, utils::Bytes&)'
     unzipper_->extractEntryToMemory(fileName, bytes);
                                                    ^
In file included from /resourcecache/pakfile.hpp:13:0,
                 from /resourcecache/pakfile.cpp:1:
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note: candidate: bool zipper::Unzipper::extractEntryToMemory(const string&, std::vector<unsigned char>&)
     bool extractEntryToMemory(const std::string& name, std::vector<unsigned char>& vec);
          ^~~~~~~~~~~~~~~~~~~~
/projects/linux/../../thirdparty/zipper/zipper/unzipper.h:31:10: note:   no known conversion for argument 2 from 'utils::Bytes {aka std::vector<std::byte>}' to 'std::vector<unsigned char>&'

I have tried naive casts, but these do not help either. So, if it is designed to be useful, will it actually be useful in old contexts? The only method I see is to use std::transform to fill the new vector of bytes in these places:

utils::Bytes bytes;
std::vector<unsigned char> rawBytes;
unzipper_->extractEntryToMemory(fileName, rawBytes);
std::transform(rawBytes.cbegin(),
               rawBytes.cend(),
               std::back_inserter(bytes),
               [](const unsigned char c) {
                   return static_cast<std::byte>(c);
               });
return bytes;

Which is:

  1. Ugly.
  2. Takes a lot of useless lines (it can be rewritten, but it still needs to be written first :)).
  3. Copies the memory instead of just using the already created rawBytes chunk.

So, how can it be used in old places?

You're missing the point of why std::byte was invented in the first place. It was invented to hold a raw byte in memory without the assumption that it's a character. You can see that in cppreference.

Like char and unsigned char, it can be used to access raw memory occupied by other objects (object representation), but unlike those types, it is not a character type and is not an arithmetic type.

Remember that C++ is a strongly typed language in the interest of safety (so implicit conversions are restricted in many cases). Meaning: if an implicit conversion from byte to char were possible, it would defeat the purpose.

So, to answer your question: to use it, you have to cast whenever you want to assign to it:
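The restriction on implicit conversions can be checked at compile time. The following is a minimal sketch, assuming a C++17 compiler; the helper names byte_to_int and int_to_byte are hypothetical and exist only for illustration.

```cpp
#include <cstddef>      // std::byte, std::to_integer
#include <type_traits>

// std::byte is a scoped enumeration, not a character or arithmetic type.
static_assert(std::is_enum_v<std::byte>);
static_assert(!std::is_arithmetic_v<std::byte>);

// No implicit conversion exists in either direction.
static_assert(!std::is_convertible_v<std::byte, unsigned char>);
static_assert(!std::is_convertible_v<unsigned char, std::byte>);

// The underlying storage is still a single unsigned char.
static_assert(std::is_same_v<std::underlying_type_t<std::byte>, unsigned char>);
static_assert(sizeof(std::byte) == 1);

// Explicit conversions are the intended way in and out (hypothetical helpers).
inline int byte_to_int(std::byte b) { return std::to_integer<int>(b); }
inline std::byte int_to_byte(int v) { return static_cast<std::byte>(v); }
```

If any of the static_assert lines failed, the translation unit would not compile, so a successful build is itself the check.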

std::byte x = static_cast<std::byte>(10);
std::byte y = static_cast<std::byte>('a');
std::cout << static_cast<int>(x) << std::endl;   // prints 10
std::cout << static_cast<char>(y) << std::endl;  // prints a

Anything else will not work, by design! So that transform is ugly, agreed, but if you want to store chars, then use char. Don't use bytes unless you want to store raw memory that should not be interpreted as char by default.

Also, the last part of your question is generally incorrect: you don't have to make copies, because you don't have to copy the whole vector. If you temporarily need to read a byte as a char, simply static_cast it at the place where you need to use it as a char. It costs nothing, and is type-safe.
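As a rough illustration of casting at the point of use, the sketch below scans a byte vector and interprets each element as a char only for the comparison. The function name count_char is hypothetical; nothing is copied.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count how often a given character occurs in raw bytes.
// Each std::byte is viewed as a char only where the comparison needs it.
std::size_t count_char(const std::vector<std::byte>& bytes, char wanted) {
    std::size_t n = 0;
    for (std::byte b : bytes) {
        if (static_cast<char>(b) == wanted) {
            ++n;
        }
    }
    return n;
}
```

The vector keeps its std::byte element type throughout, so the rest of the code still cannot treat the data as text by accident.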


As to your question in the comment about casting std::vector<char> to std::vector<std::byte>, you can't do that. But you can use the raw array underneath. The following yields a char*:

std::vector<std::byte> bytes;
// fill it...
char* charBytes = reinterpret_cast<char*>(bytes.data());

charBytes is a pointer to the first element of the array and can be dereferenced without copying, as follows:

 std::cout << charBytes[5] << std::endl; //6th element of the vector as char 

You get the size from bytes.size(). This is valid, since std::vector is contiguous in memory. You generally can't do this with other std containers (deque, list, etc.).

While this is valid, it removes part of the safety from the equation; keep that in mind. If you need char, don't use byte.

If you want something that behaves like a byte in the way you'd probably expect, but with a name distinctly different from unsigned char, use uint8_t from stdint.h. For almost all implementations this will probably be a
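The same idea applies when an old-style function wants a pointer and a length. Here is a minimal sketch: legacy_sum is a hypothetical stand-in for such an API (it is not from zipper or any real library), and sum_bytes is a hypothetical adapter that passes the vector's storage without copying.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an old-style API that takes a raw buffer.
unsigned legacy_sum(const unsigned char* data, std::size_t size) {
    unsigned total = 0;
    for (std::size_t i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

// Hypothetical adapter: hand the vector's contiguous storage to the
// legacy function. Reading any object through unsigned char* is allowed.
unsigned sum_bytes(const std::vector<std::byte>& bytes) {
    return legacy_sum(reinterpret_cast<const unsigned char*>(bytes.data()),
                      bytes.size());
}
```

Keeping the reinterpret_cast inside one small adapter confines the loss of type safety to a single place.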

typedef unsigned char uint8_t;

and so again an unsigned char under the hood, but who cares? You just want to emphasize "This is not a character type". Just don't expect to be able to have two overloads of a function, one for unsigned char and one for uint8_t. But if you try, the compiler will rub your nose in it anyway...

If your old-style code takes ranges or iterators as arguments, you can continue to use those. In the few cases where you cannot (such as explicit range-based constructors), you could in theory write a new iterator class that wraps an iterator to unsigned char and converts *it to std::byte&.

If you really want to do it and you're sure it's safe, you can use a pointer cast:
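One practical consequence of that typedef can be sketched as follows, assuming a C++17 compiler; the names uint8_is_unsigned_char and as_number are hypothetical. Where the alias holds, a std::uint8_t still behaves like a character when streamed, so a conversion to int is needed to get a number.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// True on most mainstream implementations, but not guaranteed by the standard.
constexpr bool uint8_is_unsigned_char =
    std::is_same_v<std::uint8_t, unsigned char>;

// Where the alias holds, `std::cout << v` prints a character, and defining
// both f(unsigned char) and f(std::uint8_t) is a redefinition error.
// Convert to int first when a numeric representation is wanted.
std::string as_number(std::uint8_t v) {
    return std::to_string(static_cast<int>(v));
}
```

Calling as_number with the value 65 yields the text "65" rather than the character 'A'.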

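std::vector<std::byte> v;
void f(std::vector<unsigned char>& v);
// Not guaranteed by the standard: the two vector types are unrelated,
// so accessing one through the other is formally undefined behavior.
f(*reinterpret_cast<std::vector<unsigned char>*>(&v));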
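If casting the vector itself feels too risky, one well-defined alternative is to let the old function fill a std::vector<unsigned char> and then move the data across in a single bulk copy. The sketch below assumes that a copy is acceptable; to_bytes is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy a legacy buffer into a std::byte vector
// with one memcpy instead of an element-by-element transform.
std::vector<std::byte> to_bytes(const std::vector<unsigned char>& raw) {
    std::vector<std::byte> out(raw.size());
    if (!raw.empty()) {  // data() may be null for an empty vector
        std::memcpy(out.data(), raw.data(), raw.size());
    }
    return out;
}
```

This still copies the memory, but it replaces the std::transform call with a single line at the call site.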
