Splitting a string of bytes into a vector of BYTEs in C++

Question

I have a string of bytes that looks like this:

"1,3,8,b,e,ff,10"

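How can I split this string into a std::vector of BYTEs containing the following values:

[0x01, 0x03, 0x08, 0x0b, 0x0e, 0xff, 0x10]

I am trying to split the string using ',' as the delimiter, but I am having some trouble getting it to work. Can someone help me with this?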

So I tried this:

    std::istringstream iss("1 3 8 b e ff 10");
    BYTE num = 0;
    while(iss >> num || !iss.eof()) 
    {
        if(iss.fail()) 
        {
            iss.clear();
            std::string dummy;
            iss >> dummy;
            continue;
        }
        dataValues.push_back(num);
    }

But this pushes the ASCII byte values into the vector:

49 //1
51 //3
56 //8
98 //b
101 //e
102 //f
102 //f
49 //1
48 //0

I am trying to populate the vector with:

 0x01
 0x03
 0x08
 0x0b
 0x0e
 0xff
 0x10

Answer 1

You just missed adapting a few small points from the linked answer in my comment to your use case:

    std::istringstream iss("1,3,8,b,e,ff,10");
    std::vector<BYTE> dataValues;

    unsigned int num = 0; // read an unsigned int in 1st place
                          // BYTE is just a typedef for unsigned char
    while(iss >> std::hex >> num || !iss.eof()) {
        if(iss.fail()) {
            iss.clear();
            char dummy;
            iss >> dummy; // use char as dummy if no whitespaces 
                          // may occur as delimiters
            continue;
        }
        if(num <= 0xff) {
            dataValues.push_back(static_cast<BYTE>(num));
        }
        else {
            // Error single byte value expected
        }
    }

You can see the complete working example on ideone.
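For reference, the loop above can be wrapped into a small self-contained function. This is only a sketch under a few assumptions: `parseHexBytes` is a hypothetical name, `BYTE` is assumed to be a typedef for `unsigned char`, and values above 0xff are reported by throwing `std::out_of_range`, which is just one possible way to fill in the error branch.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

typedef unsigned char BYTE; // assumption: BYTE is a typedef for unsigned char

// Hypothetical helper: parses hex values separated by non-hex characters
// (for example commas) into a vector of bytes.
std::vector<BYTE> parseHexBytes(const std::string& text) {
    std::istringstream iss(text);
    std::vector<BYTE> dataValues;
    unsigned int num = 0; // read into an unsigned int, not into a BYTE
    while (iss >> std::hex >> num || !iss.eof()) {
        if (iss.fail()) {
            // Extraction stopped at a delimiter: reset the stream state
            // and consume that one character.
            iss.clear();
            char dummy;
            iss >> dummy;
            continue;
        }
        if (num > 0xff) {
            throw std::out_of_range("value does not fit in one byte");
        }
        dataValues.push_back(static_cast<BYTE>(num));
    }
    return dataValues;
}
```

Calling `parseHexBytes("1,3,8,b,e,ff,10")` should yield the seven bytes listed in the question.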

Answer 2

A working code example (tested with GCC 4.9.0 using C++11):

The file save.txt contains 1,3,8,b,e,ff,10 as its first and only line.

Output:

1
3
8
b
e
ff
10

The idea is:

Use std::getline to read the file line by line.
Use boost::split to split the line on the delimiter.
Use std::stringstream to convert each hex string to an unsigned char.

Code:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>

int main(int argc, char* argv[]) {
    std::ifstream ifs("e:\\save.txt");

    std::string line;
    std::vector<std::string> tokens;
    std::getline(ifs, line);
    boost::split(tokens, line, boost::is_any_of(","));

    std::vector<unsigned char> values;
    for (const auto& t : tokens) {
        unsigned int x;
        std::stringstream ss;
        ss << std::hex << t;
        ss >> x;

        values.push_back(x);
    }

    for (auto v : values) {
        std::cout << std::hex << (unsigned long)v << std::endl;
    }

    return 0;
}
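If Boost is not available, the same split-then-convert idea can be sketched with the standard library alone. Note that this swaps in a different technique: `std::getline` with ',' as the delimiter replaces `boost::split`, and `std::stoul` with base 16 replaces the stringstream conversion. The function name `splitHexLine` and the decision to throw on out-of-range tokens are assumptions made for illustration.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on commas and converts each token
// from hexadecimal text to an unsigned char.
std::vector<unsigned char> splitHexLine(const std::string& line) {
    std::vector<unsigned char> values;
    std::istringstream tokens(line);
    std::string token;
    while (std::getline(tokens, token, ',')) {
        // std::stoul throws std::invalid_argument when the token does not
        // start with a valid hex number (for example an empty token).
        unsigned long x = std::stoul(token, nullptr, 16);
        if (x > 0xff) {
            throw std::out_of_range("token does not fit in one byte: " + token);
        }
        values.push_back(static_cast<unsigned char>(x));
    }
    return values;
}
```

Unlike the original code, this version rejects empty tokens and values larger than 0xff instead of silently truncating them. Whether that is desirable depends on how trusted the input is.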

Answer 3

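Just to demonstrate another, possibly faster, way of doing this, consider reading everything into an array and converting it with a custom iterator.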

class ToHexIterator : public std::iterator<std::input_iterator_tag, int>{
    char* it_;
    char* end_;
    int current_;
    bool isHex(const char c){
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
    char toUpperCase(const char c){
        if (c >= 'a' && c <= 'f'){
            return (c - 'a') + 'A';
        }
        return c;
    }
    int toNibble(const char c){
        auto x = toUpperCase(c);
        if (x >= '0' && x <= '9'){
            return x - '0';
        }
        else {
            return (x - 'A') + 10;
        }
    }
public:
    ToHexIterator() :it_{ nullptr }, end_{ nullptr }, current_{}{}                  //default constructed means end iterator
    ToHexIterator(char* begin, char* end) :it_{ begin }, end_{ end }, current_{}{
        while (it_ != end_ && !isHex(*it_)){ ++it_; }  //make sure we are pointing to valid stuff (check the bound before dereferencing)
        ++(*this);
    }
    bool operator==(const ToHexIterator &other){
        return it_ == nullptr && end_ == nullptr && other.it_ == nullptr && other.end_ == nullptr;
    }
    bool operator!=(const ToHexIterator &other){
        return !(*this == other);
    }
    int operator*(){
        return current_;
    }
    ToHexIterator & operator++(){
        current_ = 0;
        if (it_ != end_) {
            while (it_ != end_ && isHex(*it_)){  //check the bound before dereferencing
                current_ <<= 4;
                current_ += toNibble(*it_);
                ++it_;
            };
            while (it_ != end_ && !isHex(*it_)){ ++it_; }  //skip delimiters without reading past the end
        }
        else {
            it_ = nullptr;
            end_ = nullptr;
        }
        return *this;
    }
    ToHexIterator operator++(int){
        ToHexIterator temp(*this);
        ++(*this);
        return temp;
    }
};

Basic usage looks like this:

char in[] = "1,3,8,b,e,ff,10,--";
std::vector<int> v;
std::copy(ToHexIterator{ std::begin(in), std::end(in) }, ToHexIterator{}, std::back_inserter(v));

Note that performing the ASCII-to-hex nibble conversion with a lookup table would probably be faster.
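A minimal sketch of what such a lookup table could look like, assuming C++14 or later so the table can be filled in a constexpr constructor. The names `NibbleTable`, `kNibbles` and `parseHexRuns` are hypothetical, and the parsing loop is a plain function rather than the iterator above, to keep the example short.

```cpp
#include <vector>

// 256-entry table mapping a character code to its hex value, or -1 if the
// character is not a hex digit. Filled at compile time (C++14 constexpr).
struct NibbleTable {
    signed char value[256];
    constexpr NibbleTable() : value() {
        for (int i = 0; i < 256; ++i) value[i] = -1;
        for (int i = 0; i < 10; ++i) value['0' + i] = static_cast<signed char>(i);
        for (int i = 0; i < 6; ++i) {
            value['a' + i] = static_cast<signed char>(10 + i);
            value['A' + i] = static_cast<signed char>(10 + i);
        }
    }
};

constexpr NibbleTable kNibbles{};

// Hypothetical helper: collects the value of every run of hex digits
// found in [begin, end); any other character acts as a delimiter.
std::vector<int> parseHexRuns(const char* begin, const char* end) {
    std::vector<int> out;
    const char* it = begin;
    while (it != end) {
        int nibble = kNibbles.value[static_cast<unsigned char>(*it)];
        if (nibble < 0) { ++it; continue; } // skip delimiters
        int current = 0;
        while (it != end &&
               (nibble = kNibbles.value[static_cast<unsigned char>(*it)]) >= 0) {
            current = (current << 4) + nibble;
            ++it;
        }
        out.push_back(current);
    }
    return out;
}
```

The table replaces the range comparisons in isHex and toNibble with a single indexed load per character. Whether that is actually faster would have to be measured on the target compiler and platform.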

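Speed can depend heavily on compiler optimizations and the platform, but because some istringstream functions are implemented as virtual functions or pointers to functions (depending on the standard library implementation), the optimizer has trouble with them. In my code there are no virtuals or function pointers, and the only loop is inside the std::copy implementation, which the optimizer is used to dealing with. It is also generally faster to loop until two addresses are equal than to loop until the thing some changing pointer points to equals something. At the end of the day this is all speculation and voodoo on my part, but on my machine with MSVC13 it is about 10x faster. Here is a live example on GCC, http://ideone.com/nuwu15, where it is between 10x and 3x faster depending on the run and on which test goes first (probably because of some caching effects).

All in all, there is undoubtedly more room for optimization. Anyone who says "mine is always faster" at this level of abstraction is selling snake oil.

Update: using a compile-time generated lookup table improves speed further: http://ideone.com/ady8GY (note that I increased the size of the input string to reduce noise, so this is not directly comparable to the example above).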

Answer 1
1 (accepted) 2014-07-24 19:54:20

Answer 2
0 2014-07-24 17:13:34

Answer 3
0 2014-07-24 18:12:23
