
Resizable char buffer container type for C++

I'm using libcurl (an HTTP transfer library) with C++ to download files from remote HTTP servers. As a file downloads, my callback function is called multiple times (e.g. every 10 KB) to hand me buffer data.

Basically I need something like a "string buffer": a data structure that appends a char buffer to an existing string. In C, I allocate (malloc) a char*, and as new buffers arrive I realloc and then memcpy, so I can easily copy each buffer into the resized array.

In C++, there are multiple ways to achieve this:

  1. I can keep using malloc, realloc and memcpy, but I'm pretty sure they are not recommended in C++.
  2. I can use vector<char>.
  3. I can use stringstream.

My use case is this: I'll append a few thousand items (chars) at a time, and once the download completes I will read all of it at once. I may need options like seek in the future (easy to achieve with the array solution (1)), but that is low priority for now.

What should I use?

I'd go for stringstream. Just insert into it as you receive the data, and when you're done you can extract a full std::string from it. I don't see why you'd want to seek into an array, though. Anyway, if you know the block size, you can calculate where in the string the corresponding block went.
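A minimal sketch of the stringstream approach. The function has the same shape as a libcurl write callback (the one registered with CURLOPT_WRITEFUNCTION), and it assumes the pointer registered with CURLOPT_WRITEDATA is a std::ostringstream. The names append_to_stream and stream_demo are hypothetical, and the driver only simulates callback invocations instead of performing a real transfer.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Same parameter list as a libcurl write callback.
// Assumption: `userdata` points at a std::ostringstream.
std::size_t append_to_stream(char* ptr, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    auto* out = static_cast<std::ostringstream*>(userdata);
    const std::size_t len = size * nmemb;
    // write() copies raw bytes; operator<< would stop at the first '\0'.
    out->write(ptr, static_cast<std::streamsize>(len));
    // Returning a count other than `len` tells libcurl the write failed.
    return out->good() ? len : 0;
}

// Hypothetical driver that simulates two callback invocations.
std::string stream_demo() {
    std::ostringstream body;
    char first[] = "Hello, ";
    char second[] = "world";
    append_to_stream(first, 1, 7, &body);
    append_to_stream(second, 1, 5, &body);
    return body.str();  // the whole download as one std::string
}
```

If every block had the same size, block i would start at offset i * block_size in the resulting string, which is the calculation mentioned above.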

I'm not sure many will agree with this, but for that use case I would actually use a linked list, with each node containing an arbitrarily large array of char allocated using new. My reasoning:

  • Items are added in large chunks, one chunk at a time, always at the back.
  • I assume this could use quite a large amount of space, so you avoid the reallocation that a vector would incur whenever it needs more space.
  • Since items are read sequentially, the penalty of a linked list being unidirectional doesn't affect you.

Should seeking through the list become a priority, this wouldn't work well, though. If it's ultimately not a lot of data, I honestly think a vector would be fine, despite not being the most efficient structure.
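A rough sketch of the chunk-list idea. It swaps in std::list<std::vector<char>> for hand-written nodes allocated with new, which is an assumption made for brevity and not what the answer literally prescribes. ChunkList, its members and chunk_demo are hypothetical names.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Each appended buffer becomes its own node, so earlier data is never moved.
class ChunkList {
public:
    void append(const char* data, std::size_t len) {
        chunks_.emplace_back(data, data + len);  // copy the buffer into a new node
        total_ += len;
    }

    std::size_t size() const { return total_; }

    // Sequential read of everything, front to back, with one allocation.
    std::string str() const {
        std::string out;
        out.reserve(total_);
        for (const auto& chunk : chunks_) {
            out.append(chunk.begin(), chunk.end());
        }
        return out;
    }

private:
    std::list<std::vector<char>> chunks_;
    std::size_t total_ = 0;
};

// Hypothetical driver: three chunks of different sizes.
std::string chunk_demo() {
    ChunkList list;
    list.append("abc", 3);
    list.append("de", 2);
    list.append("fghi", 4);
    return list.str();
}
```

Random access by byte offset would require walking the nodes, which is why seeking is the weak spot of this layout.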

If you just need to append char buffers, you can also simply use std::string and its member function append. On top of that, stringstream gives you formatting functionality, so you can add numbers, padding etc., but from your description you appear not to need that.
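A short sketch of the std::string variant, again using the parameter list of a libcurl write callback and assuming the user pointer is a std::string. The names append_to_string and string_demo are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Assumption: `userdata` points at the std::string that collects the body.
std::size_t append_to_string(char* ptr, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    auto* body = static_cast<std::string*>(userdata);
    const std::size_t len = size * nmemb;
    body->append(ptr, len);  // the (pointer, count) overload keeps embedded '\0' bytes
    return len;
}

// Hypothetical driver: a chunk with an embedded zero byte, then a text chunk.
std::size_t string_demo() {
    std::string body;
    char binary[] = {'a', '\0', 'b'};
    char text[] = "cd";
    append_to_string(binary, 1, 3, &body);
    append_to_string(text, 1, 2, &body);
    return body.size();
}
```

Because append takes an explicit length, the string can hold binary downloads as well as text.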

I think I'd use a deque<char>. It has much the same interface as vector, and vector would do, but vector needs to copy all of its data each time an append exceeds its existing capacity. Growth is exponential, but you'd still expect about log N reallocations, where N is the number of equal-sized blocks of data you append. A deque doesn't relocate its existing elements when it grows at the end, so it's the container of choice in cases where a vector would need to reallocate several times.
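To get a feel for the "about log N reallocations" estimate, the following sketch counts how often a vector's capacity changes while equal-sized blocks are appended. The function name count_reallocations is hypothetical, and the exact count depends on the growth factor of the standard library in use, so no specific number is claimed here.

```cpp
#include <cstddef>
#include <vector>

// Appends `blocks` chunks of `block_size` bytes to a vector and counts
// how many times its capacity changed, i.e. how often it reallocated.
std::size_t count_reallocations(std::size_t blocks, std::size_t block_size) {
    std::vector<char> data;
    const std::vector<char> block(block_size, 'x');
    std::size_t reallocations = 0;
    std::size_t last_capacity = data.capacity();

    for (std::size_t i = 0; i < blocks; ++i) {
        data.insert(data.end(), block.begin(), block.end());
        if (data.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = data.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000, 10240) simulates roughly a 10 MB download in 10 KB blocks; with exponential growth the result should be far smaller than the 1000 appends.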

Assuming the callback is handed a char* buffer and a length, the code to copy and append the data is simple enough:

mydeque.insert(mydeque.end(), buf, buf + len);

To get a string at the end, if you want one:

std::string mystring(mydeque.begin(), mydeque.end());

I'm not exactly sure what you mean by seek, but obviously deque can be accessed by index or iterator, same as vector.

Another possibility, though, is that if you expect a Content-Length at the start of the download, you could use a vector and reserve() enough space for the data before you start, which avoids reallocation. That depends on what HTTP requests you're making, and to what servers, since some HTTP responses use chunked encoding and don't provide the size up front.
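A sketch of the reserve() idea, assuming the expected length has already been obtained somehow (for example from the Content-Length header) and is passed in as a plain integer, with a negative value meaning "unknown". The helpers reserve_if_known, append_to_vector and reserve_demo are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Reserve storage up front when the server announced a length.
// Assumption: a value <= 0 means the length is unknown (e.g. chunked encoding).
void reserve_if_known(std::vector<char>& body, long long content_length) {
    if (content_length > 0) {
        body.reserve(static_cast<std::size_t>(content_length));
    }
}

// Same parameter list as a libcurl write callback; `userdata` is assumed
// to point at the std::vector<char> that collects the body.
std::size_t append_to_vector(char* ptr, std::size_t size, std::size_t nmemb,
                             void* userdata) {
    auto* body = static_cast<std::vector<char>*>(userdata);
    const std::size_t len = size * nmemb;
    body->insert(body->end(), ptr, ptr + len);
    return len;
}

// Hypothetical driver: 30 bytes announced, delivered in three 10-byte chunks.
bool reserve_demo() {
    std::vector<char> body;
    reserve_if_known(body, 30);
    const char* before = body.data();
    char chunk[] = "0123456789";
    for (int i = 0; i < 3; ++i) {
        append_to_vector(chunk, 1, 10, &body);
    }
    // The storage address is unchanged, so no reallocation took place.
    return body.size() == 30 && body.data() == before;
}
```

When the length is unknown the vector simply grows as usual, so the same callback works for both cases.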

I would use vector<char>. But they will all work, even with a seek, so your question is really one of style, and there are no definitive answers there.

Create your own Buffer class to abstract away the details of the storage. If I were you, I would likely implement the buffer based on std::vector<char>.
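One possible shape for such a class, as a sketch only. The member names (append, seek, read, str) and the read-cursor design are assumptions chosen for illustration; the point is that callers never see the std::vector<char> underneath, so the storage could later be swapped for a deque or a chunk list.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class Buffer {
public:
    // Append a block of bytes, e.g. from a download callback.
    void append(const char* data, std::size_t len) {
        storage_.insert(storage_.end(), data, data + len);
    }

    std::size_t size() const { return storage_.size(); }

    // Move the read cursor; returns false if `pos` is past the end.
    bool seek(std::size_t pos) {
        if (pos > storage_.size()) return false;
        pos_ = pos;
        return true;
    }

    // Copy up to `len` bytes from the cursor into `out`; returns bytes copied.
    std::size_t read(char* out, std::size_t len) {
        const std::size_t n = std::min(len, storage_.size() - pos_);
        std::copy_n(storage_.begin() + static_cast<std::ptrdiff_t>(pos_), n, out);
        pos_ += n;
        return n;
    }

    std::string str() const {
        return std::string(storage_.begin(), storage_.end());
    }

private:
    std::vector<char> storage_;  // implementation detail, hidden from callers
    std::size_t pos_ = 0;        // read cursor used by seek() and read()
};

// Hypothetical driver: append two blocks, seek to offset 4, read 3 bytes.
std::string buffer_demo() {
    Buffer buf;
    buf.append("abcdef", 6);
    buf.append("ghij", 4);
    char out[3];
    if (!buf.seek(4)) return "";
    const std::size_t n = buf.read(out, sizeof out);
    return std::string(out, n);
}
```

With contiguous storage, seek is a simple bounds check, which covers the low-priority seek requirement from the question.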
