Resizable char buffer container type for C++

Question

I'm using libcurl (an HTTP transfer library) with C++ to download files from remote HTTP servers. As a file downloads, my callback function is called multiple times (e.g. every 10 KB) to hand me buffer data.

Basically I need something like a "string buffer": a data structure that lets me append a char buffer to an existing string. In C, I allocate (malloc) a char*, and as new buffers arrive I realloc and then memcpy, so that I can easily copy each buffer into the resized array.

In C++, there are multiple solutions to achieve this.

1. I can keep using malloc, realloc and memcpy, but I'm pretty sure they are not recommended in C++.
2. I can use vector<char>.
3. I can use stringstream.

My use case is this: I'll append a few thousand items (chars) at a time, and once it all finishes (the download is complete), I will read all of it at once. I may need operations like seek in the future (easy to achieve with the array solution (1)), but that is low priority for now.

What should I use?

Answer 1

I'd go for stringstream. Just insert into it as you receive the data, and when you're done you can extract a full std::string from it. I don't see why you'd want to seek into an array. Anyway, if you know the block size, you can calculate where in the string the corresponding block went.
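A minimal sketch of the stringstream approach described above. The DownloadStream name and its members are hypothetical and only illustrate the idea; it uses write() rather than operator<< so that binary data containing '\0' bytes is appended in full.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical accumulator; the name DownloadStream is an assumption,
// not something from the original post.
struct DownloadStream {
    std::ostringstream out;

    // Append len raw bytes; write() does not stop at embedded '\0' bytes.
    void append(const char* buf, std::size_t len) {
        out.write(buf, static_cast<std::streamsize>(len));
    }

    // Copy everything received so far into one std::string.
    std::string str() const { return out.str(); }
};
```

Each callback invocation would call append(), and str() would be called once after the download finishes. Note that str() returns a copy, so the data briefly exists twice in memory.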

Answer 2

I'm not sure many will agree with this, but for that use case I would actually use a linked list, with each node containing an arbitrarily large array of char allocated using new. My reasoning:

Items are added in large chunks, one chunk at a time, always at the back.
I assume this could use quite a large amount of space, so you avoid the reallocation events a vector would incur whenever it needs more space.
Since items are read sequentially, the penalty of a linked list being unidirectional doesn't affect you.

Should seeking through the list become a priority, this wouldn't work, though. If it's ultimately not a lot of data, I honestly think a vector would be fine, despite not being the most efficient structure.
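The chunked-list idea could be sketched roughly as follows. This is not the answer's exact design: it swaps the raw arrays allocated with new for std::vector<char> nodes inside a std::list, so that memory is released automatically. The ChunkList name and its members are hypothetical.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical chunk list: each node owns a copy of one received block.
class ChunkList {
public:
    // Store the block as a new node at the back; earlier nodes are never moved.
    void append(const char* buf, std::size_t len) {
        chunks_.emplace_back(buf, buf + len);
        total_ += len;
    }

    // Total number of bytes stored across all nodes.
    std::size_t size() const { return total_; }

    // Walk the nodes in order and join them into one contiguous string.
    std::string join() const {
        std::string result;
        result.reserve(total_);
        for (const std::vector<char>& chunk : chunks_) {
            result.append(chunk.begin(), chunk.end());
        }
        return result;
    }

private:
    std::list<std::vector<char>> chunks_;
    std::size_t total_ = 0;
};
```

Appending never copies data that is already stored, which is the point of the answer. The cost shows up in join(), and in any future seek, which would have to walk the nodes to find the right chunk.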

Answer 3

If you just need to append char buffers, you can also simply use std::string and its member function append. On top of that, stringstream gives you formatting functionality, so you can add numbers, padding and so on, but from your description you don't appear to need that.
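As a rough illustration, the function below has the same parameter layout as a libcurl write callback (the kind registered with CURLOPT_WRITEFUNCTION, with the user pointer supplied through CURLOPT_WRITEDATA). The name append_to_string is hypothetical, and the sketch assumes the user pointer refers to a std::string owned by the caller. It does not include or link libcurl itself.

```cpp
#include <cstddef>
#include <string>

// Same parameter layout as a libcurl write callback: the block holds
// size * nmemb bytes, and userdata is whatever pointer the caller registered.
// Assumption: userdata points at a std::string owned by the caller.
std::size_t append_to_string(char* ptr, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    std::string* body = static_cast<std::string*>(userdata);
    const std::size_t bytes = size * nmemb;
    body->append(ptr, bytes);  // appends exactly `bytes` chars, '\0' included
    return bytes;              // report every byte as consumed
}
```

Using the append(ptr, count) overload matters here: the single-argument overload would treat the buffer as a NUL-terminated C string, which downloaded data generally is not.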

Answer 4

I think I'd use a deque<char>. It has much the same interface as vector, and vector would do, but vector has to copy all of its data each time an append exceeds its current capacity. Growth is exponential, but you'd still expect about log N reallocations, where N is the number of equal-sized blocks of data you append. A deque doesn't reallocate its existing elements when you append, so it's the container of choice in cases where a vector would need to reallocate several times.
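To get a feel for the "about log N reallocations" estimate, a small experiment along these lines can count how often a vector's capacity changes while equal-sized blocks are appended. The function name is hypothetical, and the exact count depends on the standard library's growth policy, so treat the result as indicative only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical experiment: append `blocks` blocks of `block_size` bytes and
// count how often capacity() changes, i.e. how often the vector reallocated.
std::size_t count_reallocations(std::size_t blocks, std::size_t block_size) {
    std::vector<char> data;
    const std::vector<char> block(block_size, 'x');
    std::size_t reallocations = 0;
    std::size_t last_capacity = data.capacity();
    for (std::size_t i = 0; i < blocks; ++i) {
        data.insert(data.end(), block.begin(), block.end());
        if (data.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = data.capacity();
        }
    }
    return reallocations;
}
```

With common implementations, calling count_reallocations(1000, 1024) should report far fewer reallocations than the 1000 appends, which is the exponential growth the answer refers to.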

Assuming the callback is handed a char* buffer and a length, the code to copy and append the data is simple enough:

mydeque.insert(mydeque.end(), buf, buf + len);

To get a string at the end, if you want one:

std::string mystring(mydeque.begin(), mydeque.end());

I'm not exactly sure what you mean by seek, but obviously a deque can be accessed by index or iterator, same as a vector.

Another possibility, though, is that if you expect a content-length at the start of the download, you could use a vector and reserve() enough space for the data before you start, which avoids reallocation. That depends on what HTTP requests you're making, and to what servers, since some HTTP responses use chunked encoding and won't provide the size up front.
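A small sketch of the reserve() idea, assuming the expected size has already been obtained somehow (for instance from a Content-Length header, when one is sent). Both function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: expected_size is assumed to come from a
// Content-Length header; skip this call when the size is unknown.
void reserve_for_download(std::vector<char>& data, std::size_t expected_size) {
    data.reserve(expected_size);  // one allocation up front
}

// Append one received block; no reallocation happens while the total
// size stays within the reserved capacity.
void append_block(std::vector<char>& data, const char* buf, std::size_t len) {
    data.insert(data.end(), buf, buf + len);
}
```

If the server sends more data than announced, or no size at all, append_block still works; the vector simply falls back to its normal growth behaviour.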

Answer 5

I would use vector<char>. But they will all work even with a seek, so your question is really one of style, and there are no definitive answers there.

Answer 6

Create your own Buffer class to abstract away the details of the storage. If I were you, I would likely implement the buffer on top of std::vector<char>.
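One possible shape for such a class, as a sketch only. The answer names the class Buffer and suggests std::vector<char> as storage; the member functions shown here are assumptions chosen to cover the question's needs (append, read everything, occasional access by offset).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Buffer wrapper; the member names are illustrative only.
class Buffer {
public:
    // Append len bytes from buf to the end of the buffer.
    void append(const char* buf, std::size_t len) {
        data_.insert(data_.end(), buf, buf + len);
    }

    std::size_t size() const { return data_.size(); }

    // Access by byte offset (a simple form of "seek");
    // at() throws std::out_of_range for an invalid offset.
    char at(std::size_t offset) const { return data_.at(offset); }

    // Copy the whole contents into a std::string.
    std::string str() const { return std::string(data_.begin(), data_.end()); }

private:
    std::vector<char> data_;  // storage detail hidden from callers
};
```

Because callers only see append(), size(), at() and str(), the storage could later be switched to a deque or a chunk list without touching the download code.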

6 answers

Answer 1
1 Accepted 2011-08-23 06:04:08

Answer 2
1 2011-08-23 06:23:07

Answer 3
1 2011-08-23 06:59:02

Answer 4
1 2011-08-23 09:05:08

Answer 5
0 2011-08-23 06:02:49

Answer 6
0 2011-08-23 06:47:26
