
Inserting std::strings into a std::map

I have a program that reads data from a file line by line. I would like to copy a substring of each line into a map, as below:

std::map< DWORD, std::string > my_map;
DWORD index;         // populated with some data
char buffer[ 1024 ]; // populated with some data
char* element_begin; // points to some location in buffer
char* element_end;   // points to some location in buffer > element_begin

my_map.insert( std::make_pair( index, std::string( element_begin, element_end ) ) );

This std::map<>::insert() operation takes a long time (it doubles the file parsing time). Is there a way to make it less expensive?

Thanks, PaulH

Edit: to be more specific, I want to know that I'm doing the minimum number of copy operations to get the data from the file into the map.

Do you really need a map here? As far as I can see from your example, you only want to store an index as the key, which, I suppose, is simply incremented for each insertion. You could accomplish this with a std::vector, which is usually the fastest container for this kind of append-and-index access. Just use push_back and access the value with at(index).
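A rough sketch of that idea, assuming the keys really are consecutive integers starting at 0 (the question does not confirm this). The helper name append_element is made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends the substring [begin, end) to the vector and
// returns the position it was stored at, which plays the role of the map key.
std::size_t append_element(std::vector<std::string>& elements,
                           const char* begin, const char* end)
{
    assert(begin <= end); // both pointers must refer to the same buffer
    elements.push_back(std::string(begin, end));
    return elements.size() - 1;
}
```

Lookups then use elements.at(index), which throws std::out_of_range for an index that was never stored. If the indices have gaps or arrive out of order, this does not apply directly.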

There are a few things you could try. There's overhead involved both in the data structure and in the creation of the string itself.

  1. Does it need to be a map? You could try std::tr1::unordered_map instead and see if that helps.

  2. How fast do lookups need to be? You could try std::vector if you can live with O(n) lookup time.

  3. Do you need to store a copy of each substring? Could you just store a pointer instead?
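For the first suggestion, here is a minimal sketch using std::unordered_map, the C++11 standard counterpart of std::tr1::unordered_map. It assumes a C++11 compiler; the DWORD typedef is a stand-in for the Windows type, and ElementMap and store_element are names invented for this example:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

typedef unsigned long DWORD; // stand-in for the Windows typedef (assumption)

typedef std::unordered_map<DWORD, std::string> ElementMap; // hypothetical name

// Hypothetical helper mirroring the insert in the question.
// Returns true if the key was new and the element was stored.
bool store_element(ElementMap& elements, DWORD index,
                   const char* begin, const char* end)
{
    assert(begin <= end); // both pointers must refer to the same buffer
    return elements.insert(
        std::make_pair(index, std::string(begin, end))).second;
}
```

If the number of lines is roughly known in advance, calling reserve() on the map before the loop may avoid rehashing while it grows. Whether the hash table actually beats the tree here is something only a measurement on the real data can tell.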

Maybe you could try another version of the string constructor:

string ( const char * s, size_t n );

If your implementation of string does not have a specialization for char *, the iterator-range constructor will be forced to traverse the given range and copy each character individually. In that case the constructor above might be faster (just a guess though).
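Applied to the pointers from the question, that constructor could be used as in the sketch below. The helper name make_element is invented here, and whether it is actually faster than the range constructor depends on the library implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds the substring [begin, end) with the
// pointer-and-length constructor instead of the iterator-range one.
std::string make_element(const char* begin, const char* end)
{
    assert(begin <= end); // both pointers must refer to the same buffer
    return std::string(begin, static_cast<std::size_t>(end - begin));
}
```

Both constructors produce the same string, so swapping one for the other is a safe experiment to time.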

To partly answer your supplementary question: try temporarily changing the map to a vector of strings, and then time inserting a fixed string value into the vector. For example:

std::vector<std::string> v;
std::string s( "foobar" );

// inside your insert loop:
v.push_back( s );

That should give you a bound on the best speed you can realistically expect.

Also, you should time things with all optimisations turned on (if you are not already doing so). This can make a surprising difference to many Standard Library operations.
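One way to take that baseline measurement is sketched below with std::chrono::steady_clock, which assumes a C++11 compiler. The function name time_push_back and the choice of microseconds are illustrative only:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends `count` copies of `s` to `v` and returns the
// elapsed wall-clock time in microseconds.
long long time_push_back(std::vector<std::string>& v,
                         const std::string& s, std::size_t count)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(s); // one string copy per iteration
    }
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();

    assert(v.size() >= count); // the caller keeps v, so the work is observable
    return static_cast<long long>(
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
            .count());
}
```

Use a count close to the number of lines in the real file and compare the result with the time the map version takes on the same input, both built with optimisations enabled.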

You are storing strings, but I guess you have already read them and then add them to the map. This results in a copy. Storing a pointer to the string (string* instead of string) will probably be faster.

If your compiler isn't able to optimize away redundant copies in the insert, you can use the bracket operator to assign directly into the map:

my_map[index].assign(element_begin, element_end);

Edit: As Neil points out, this won't be helpful if duplicate keys can be inserted.
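The difference with duplicate keys can be seen in a small sketch. The DWORD typedef is a stand-in for the Windows type, and the two helper names are invented for this example:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef unsigned long DWORD; // stand-in for the Windows typedef (assumption)
typedef std::map<DWORD, std::string> ElementMap; // hypothetical name

// insert() leaves an existing entry untouched when the key is already present.
void store_with_insert(ElementMap& m, DWORD index,
                       const char* begin, const char* end)
{
    assert(begin <= end);
    m.insert(std::make_pair(index, std::string(begin, end)));
}

// operator[] default-constructs a missing entry; assign() then overwrites
// whatever string is stored under the key, new or old.
void store_with_bracket(ElementMap& m, DWORD index,
                        const char* begin, const char* end)
{
    assert(begin <= end);
    m[index].assign(begin, end);
}
```

With a repeated key, the insert version keeps the first value and the bracket version keeps the last one, so the two are only interchangeable when every key is unique.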

Given that you need to get the data into a std::map<DWORD, std::string>, then yes, you are doing the minimum number of copy operations to get the data into the map.
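As a side note, if a C++11 compiler is available (the thread does not assume one), std::map::emplace can construct the string directly inside the map node. This is only a sketch; the DWORD typedef is a stand-in for the Windows type and emplace_element is a made-up name:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

typedef unsigned long DWORD; // stand-in for the Windows typedef (assumption)

// Hypothetical helper: builds the key and the string in place from their
// constructor arguments. Returns true if the key was new.
bool emplace_element(std::map<DWORD, std::string>& m, DWORD index,
                     const char* begin, const char* end)
{
    assert(begin <= end); // both pointers must refer to the same buffer
    return m.emplace(std::piecewise_construct,
                     std::forward_as_tuple(index),
                     std::forward_as_tuple(begin, end)).second;
}
```

The characters are still copied once from the buffer into the string, which cannot be avoided as long as the map owns its values.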

I believe most of your execution time with the map is spent copying strings. The std::map likes to have its own copy of everything, so when you insert, the std::map makes a copy of the key and the value.

A long time ago, when processors were slow and memory was small, programmers would use pointers to "large" data items and pass the pointer around rather than copying the data each time. A pointer is a much smaller entity than a string and takes less time to copy. Perhaps you should store pointers to strings in the map:

#include <map>
#include <string>
#include <boost/shared_ptr.hpp>

typedef boost::shared_ptr<std::string> Shared_Str_Ptr;

typedef std::map< DWORD, Shared_Str_Ptr> Map_Container;

//...
Map_Container my_map;
Shared_Str_Ptr p_str(new std::string("Hello"));
my_map[5] = p_str;

The shared_ptr will take care of memory management for you, so there are no worries when deleting the map or its contents.

See also Boost Smart Pointers.
