C++ vector of strings into associative vector of ints
I'm having trouble converting a vector of strings with ~ 1.0000.0000 elements into an associative vector of integers.
Input:
std::vector<std::string> s {"a","b","a","a","c","d","a"};
Desired output:
std::vector<int> i {0,1,0,0,2,3,0};
I was thinking of a std::unordered_multiset, as mentioned in "Associative Array with Vector in C++", but I can't get it running.
The goal is to reduce the time it takes to convert C++ strings to R strings, which is much faster if I just use numbers.
Thank you for your help!
Edit:
That's how I tried to populate the set:
for (size_t i = 0; i < s.size(); i++)
{
    set.insert(s[i]);
}
If you need just the keys, why don't you use just a vector?
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

int main()
{
    std::vector<std::string> s {"a","b","a","a","c","d","a"};
    std::vector<int> out(s.size());
    // Map each string to the offset of its first character from 'a'.
    // Note: this only distinguishes non-empty strings by their first lowercase letter.
    std::transform(s.begin(), s.end(), out.begin(), [](const auto& x)
    {
        return x[0] - 'a';
    });
    for (const auto& i : out) std::cout << i << " ";
    std::cout << std::endl;
    return 0;
}
This code produces your desired output for the given input, and it processes 1.000.000 strings of length 3 in 0.4s, so I think unordered_map is a viable choice.
#include <string>
#include <vector>
#include <iostream>
#include <unordered_map>
#include <chrono>
#include <random>

// generator function for creating a large number of strings.
std::vector<std::string> generate_strings(const std::size_t size, const std::size_t string_length)
{
    static std::random_device rd{};
    static std::default_random_engine generator{ rd() };
    static std::uniform_int_distribution<int> distribution{ 'a', 'z' };

    std::vector<std::string> strings;
    std::string s(string_length, ' ');

    for (std::size_t n = 0; n < size; n++)
    {
        for (std::size_t m = 0; m < string_length; ++m)
        {
            s[m] = static_cast<char>(distribution(generator));
        }
        strings.emplace_back(s);
    }

    return strings;
}

int main()
{
    std::vector<std::string> strings = generate_strings(1000000, 3);
    //std::vector<std::string> strings{ "a","b","a","a","c","d","a" };

    std::unordered_map<std::string, int> map;
    std::vector<int> output;

    // speed optimization, allocate enough room for answer
    // so output doesn't have to reallocate when growing.
    output.reserve(strings.size());

    auto start = std::chrono::high_resolution_clock::now();

    // each string seen for the first time gets the next free id;
    // repeated strings reuse the id stored in the map.
    int id = 0;
    for (const auto& string : strings)
    {
        if (map.find(string) == map.end())
        {
            output.push_back(id);
            map.insert({ string, id });
            id++;
        }
        else
        {
            output.push_back(map.at(string));
        }
    }

    auto duration = std::chrono::high_resolution_clock::now() - start;
    auto nanoseconds = std::chrono::duration_cast<std::chrono::nanoseconds>(duration).count();
    auto seconds = static_cast<double>(nanoseconds) / 1.0e9;

    // report the measured time for the conversion loop
    std::cout << "elapsed: " << seconds << " s\n";

    /*
    for (const auto& value : output)
    {
        std::cout << value << " ";
    }
    */
}
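As a possible variation on the loop above, the find/insert/at sequence can be collapsed into a single hash lookup per string with try_emplace (C++17). The sketch below also keeps the distinct strings in order of first appearance, which may help if the integers later have to be mapped back to strings on the R side. The names Encoded and encode are hypothetical and only for illustration, and I have not benchmarked this against the version above.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type: integer codes plus the distinct strings
// in order of first appearance, so levels[code] gives back the string.
struct Encoded
{
    std::vector<int> codes;
    std::vector<std::string> levels;
};

// Hypothetical helper, not part of any library.
Encoded encode(const std::vector<std::string>& strings)
{
    Encoded result;
    result.codes.reserve(strings.size());

    std::unordered_map<std::string, int> ids;
    for (const auto& s : strings)
    {
        // try_emplace inserts only if the key is absent and returns an
        // iterator to the element either way, so one lookup is enough.
        const auto [it, inserted] =
            ids.try_emplace(s, static_cast<int>(result.levels.size()));
        if (inserted)
        {
            result.levels.push_back(s);
        }
        result.codes.push_back(it->second);
    }
    return result;
}
```

For the sample input {"a","b","a","a","c","d","a"}, encode should return the codes 0 1 0 0 2 3 0 and the levels a b c d. Note that the codes are 0-based, while R factor codes start at 1, so an offset would be needed if they are used as a factor directly.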