
Most efficient way to convert char vector to string

I have large pcap files of market tick data, averaging about 20 GB each. The files are divided into packets. Packets are divided into a header and messages. Messages are divided into a header and fields. Fields are divided into a field code and a field value.

I am reading the file one character at a time. I have a file reader class that reads the characters and passes them by const reference to four callback functions: on_packet_delimiter, on_header_char, on_message_delimiter, and on_message_char. The message object uses a similar function to construct its fields.

Up to this point I have noticed little loss of efficiency compared to just reading the chars and doing nothing with them.

The part of my code that processes the message header and extracts the message's instrument symbol slows the process down considerably.

void message::add_char(const char& c)
{
  if (!message_header_complete) {
    if (is_first_char) {
      is_first_char = false;
      if (is_lower_case(c)) {
        first_prefix = c;
      } else {
        symbol_vector.push_back(c);
      }
    } else if (is_field_delimiter(c)) {
      on_message_header_complete();
      on_field_delimiter(c);
    } else {
      symbol_vector.push_back(c);
    }
  } else {
    // header complete, collect field information
    if (is_field_delimiter(c)) {
      on_field_delimiter(c);
    } else {
      fp->add_char(c);
    }
  }
}

...

void message::on_message_header_complete()
{
  message_header_complete =  true;
  symbol.assign(symbol_vector.begin(),symbol_vector.end());
}

...

While the header is being read, add_char() pushes the chars onto symbol_vector. Once the header is complete, on_message_header_complete() converts the vector to a string using the vector's iterators. Is this the most efficient way to do this?

How about:

std::string myStr(myVec.begin(), myVec.end());

Although this works, I don't understand why you need a vector in the first place. Just use std::string from the beginning, and use myStr.append() to add characters or strings.

Here's an example:

std::string myStr = "abcd";
myStr.append(1,'e');
myStr.append(std::string("fghi"));
//now myStr is "abcdefghi"
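Applied to the question's message class, the same idea might look like the sketch below. The names message_sketch and extract_symbol, the ',' delimiter, and the reserve size of 16 are assumptions for illustration only; the question does not show the real delimiter, and field handling is omitted.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the question's message class.
// Header characters are appended straight into a std::string, so no
// vector-to-string conversion is needed when the header completes.
class message_sketch {
public:
    message_sketch() { symbol.reserve(16); }  // assumed upper bound on symbol length

    void add_char(char c) {
        if (header_complete) {
            return;                 // field handling omitted in this sketch
        }
        if (c == delimiter) {
            header_complete = true; // symbol is already a std::string
        } else {
            symbol.push_back(c);    // amortized constant time, like vector::push_back
        }
    }

    bool header_complete = false;
    std::string symbol;

private:
    static constexpr char delimiter = ',';  // assumption: the real delimiter is not shown
};

// Convenience helper for trying the sketch: feeds a whole buffer
// through add_char and returns the collected symbol.
inline std::string extract_symbol(const std::string& raw) {
    message_sketch m;
    for (char c : raw) {
        m.add_char(c);
    }
    return m.symbol;
}
```

Whether this is measurably faster than the vector version depends on the implementation and the data, so it is worth profiling both.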

In addition to The Quantum Physicist's answer: std::string should behave much like vector does. Even the reserve function is available in the string class, if you intend to use it for efficiency.

Adding characters is as easy as it gets:

std::string s;
char c = 's';
s += c;

You could add the characters directly to your member and be done. But if you want to keep the member clean until the whole string is collected, you should still use a std::string object instead of the vector. Add the characters to the temporary string and, upon completion, swap the contents. No copying, just a pointer exchange (plus some additional data such as capacity and size).
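A minimal sketch of the temporary-plus-swap approach follows. The struct name symbol_collector, its member names, and the reserve size of 32 are hypothetical. Note that many implementations store short strings inline (small-string optimization), in which case swap exchanges the small inline buffers rather than pointers; it is still constant time.

```cpp
#include <string>

// Hypothetical collector: characters accumulate in a scratch string and
// the finished symbol is published to the member by swapping.
struct symbol_collector {
    std::string symbol;   // stays untouched until the header is complete
    std::string scratch;  // temporary buffer being filled

    symbol_collector() { scratch.reserve(32); }  // assumed typical symbol length

    void add_char(char c) { scratch += c; }

    void on_header_complete() {
        symbol.swap(scratch);  // constant-time exchange of the two strings' contents
        scratch.clear();       // drop the old symbol; capacity is typically kept for reuse
    }
};
```

After on_header_complete() returns, symbol holds the collected characters and scratch is empty and ready for the next message.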
