GPU-accelerated recursive function in C++

Question

I'm quite new to C++ (two days, actually) and I would like to know whether this code can be parallelized. I need it to be a lot faster, as there are millions of iterations. From what I've understood so far, it cannot be parallelized, because the only for loop I use depends on the previous iteration. Is that right? And if parallelization is not possible, how else can I optimize it so that it gets faster? I was quite surprised that it only runs 3x faster than my original Python code (some say C++ is up to 100 to 400x faster than Python).

If the Visual Studio 2015 project files are needed, please tell me.

If you run the application, you need to enter a SHA-1 hash and then tell the program how many characters the base word had. For example, for the word test: hash: a94a8fe5ccb19ba61c4c0873d391e987982fbbd3, length: 4

Thanks in advance.

#include "stdafx.h"
#include <stdio.h>
#include <string>
#include <iostream>
#include <cstring>
#include "..\crytoPP\sha.h"
#include "..\crytoPP\filters.h"
#include "..\crytoPP\hex.h"
#include "..\crytoPP\channels.h"

using namespace CryptoPP;
using namespace  std;

int found = 0;
int iteration = 0;
int length;
char source[] = "abcdefghijklmnopqrstuvwxyz";
string solution = " didn't match";
string base_hash;

string CHECK(string hash, int argc, char** argv);
void COMBINATIONS(string b, int length, int source_length, int argc, char** argv);

int main(int argc, char** argv)
{
    char *arr_ptr = &source[0];
    int source_length = strlen(arr_ptr);
    cout << "Please enter hash:";
    cin >> base_hash;
    cout << "Please enter length:";
    cin >> length;
    transform(base_hash.begin(), base_hash.end(), base_hash.begin(), ::toupper);
    COMBINATIONS("", ::length, source_length, argc - 1, argv + 1);
    system("PAUSE");
    return 0;
}

string CHECK(string hash, int argc, char** argv) {
    if (::found == 0) {
        iteration++;
        cout << iteration << endl;
        if (argc == 2 && argv[1] != NULL)
            hash = string(argv[1]);
        string s1;
        SHA1 sha1; SHA224 sha224; SHA256 sha256; SHA512 sha512;
        HashFilter f1(sha1, new HexEncoder(new StringSink(s1)));
        ChannelSwitch cs;
        cs.AddDefaultRoute(f1);
        StringSource ss(hash, true /*pumpAll*/, new Redirector(cs));
        cout << s1 << endl;
        if (s1 == ::base_hash) {
            ::found = 1;
            cout << " =" << hash << endl;
        }
        return s1;
    }  
}

void COMBINATIONS(string b, int length, int source_length, int argc, char** argv) {
    if (::found == 0) {
        if (length == 0) {
            CHECK(b, argc, argv);
        }
        else {
            for (int i = 0; i < source_length; i++) {
                COMBINATIONS(b + ::source[i], length -1, source_length, argc -1, argv + 1 );
            CHECK(b, argc - 1, argv + 1);
            }
        }
    }
}

Answer 1

The first thing you should try is removing the output in every iteration, as it significantly reduces the performance of your program.

Right now you invoke COMBINATIONS only once, with an empty string b. If you instead create one thread in main for every starting string b of size 1, you get e.g. 26 threads, each solving an equally sized part of the problem. Still, the best approach would be to rewrite the COMBINATIONS function so that it is better suited for parallelism.
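The one-thread-per-starting-letter idea could look roughly like the sketch below. It rests on several assumptions: toy_hash is a stand-in (64-bit FNV-1a) so that the example compiles without Crypto++, and the names search and crack_parallel are invented here. In the real program the comparison would use the SHA-1 digest instead. The found flag is a std::atomic so that all threads can stop early without a data race.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Stand-in for SHA-1 (assumption): 64-bit FNV-1a, used only to keep the
// sketch free of external libraries.
std::uint64_t toy_hash(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

const std::string kAlphabet = "abcdefghijklmnopqrstuvwxyz";

// Extends `candidate` in place until it has `length` characters, then tests it.
void search(std::string& candidate, std::size_t length, std::uint64_t target,
            std::atomic<bool>& found, std::string& result, std::mutex& m) {
    if (found.load(std::memory_order_relaxed)) return;  // another thread won
    if (candidate.size() == length) {
        if (toy_hash(candidate) == target) {
            std::lock_guard<std::mutex> lock(m);
            if (!found.exchange(true)) result = candidate;
        }
        return;
    }
    for (char c : kAlphabet) {
        candidate.push_back(c);
        search(candidate, length, target, found, result, m);
        candidate.pop_back();
    }
}

// Starts one thread per first letter; returns the match or an empty string.
std::string crack_parallel(std::uint64_t target, std::size_t length) {
    if (length == 0) return "";
    std::atomic<bool> found{false};
    std::string result;
    std::mutex m;
    std::vector<std::thread> workers;
    for (char first : kAlphabet) {
        workers.emplace_back([&, first] {
            std::string candidate(1, first);
            candidate.reserve(length);  // one buffer per thread, reused
            search(candidate, length, target, found, result, m);
        });
    }
    for (auto& w : workers) w.join();
    return result;
}
```

Calling crack_parallel(toy_hash("test"), 4) from main searches all 26^4 four-letter words across 26 threads. On a machine with fewer cores, a smaller pool that hands out first letters to idle threads would probably be a better fit.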

Moreover, every call to CHECK allocates several objects with new. C++ requires you to manage memory yourself, so anything you allocate with new must eventually be released, either with delete or by handing it to an owner that deletes it for you. Crypto++ filters take ownership of the objects passed to their constructors by pointer, so the HexEncoder, StringSink and Redirector here are cleaned up when the filters are destroyed, and you must not delete them yourself. The allocations are still not free: memory allocation is somewhat slow, and the longer the word you are looking for, the more often CHECK runs. It is better to reuse the objects you create instead of rebuilding them for every candidate word.
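The same reuse idea applies to the candidate strings: COMBINATIONS builds a new string with b + source[i] at every level of the recursion. A possible alternative, sketched below with the invented names next_candidate and count_candidates, keeps a single buffer and advances it in place like an odometer. It assumes that every character of the word is contained in the alphabet.

```cpp
#include <string>

// Advances `word` in place to the next combination over `alphabet`.
// Returns false once all combinations have been visited, at which point
// `word` has wrapped around to the first combination again.
// Assumption: every character of `word` occurs in `alphabet`.
bool next_candidate(std::string& word, const std::string& alphabet) {
    for (std::size_t i = word.size(); i-- > 0;) {
        std::size_t pos = alphabet.find(word[i]);
        if (pos + 1 < alphabet.size()) {
            word[i] = alphabet[pos + 1];
            return true;
        }
        word[i] = alphabet[0];  // this position wraps, carry to the left
    }
    return false;
}

// Visits every word of the given length using one reused buffer and
// returns how many words were visited.
std::size_t count_candidates(std::size_t length, const std::string& alphabet) {
    std::string word(length, alphabet[0]);  // allocated once
    std::size_t n = 0;
    do {
        ++n;  // a real program would hash `word` here
    } while (next_candidate(word, alphabet));
    return n;
}
```

Because each candidate is derived from the previous one without recursion, the search space can also be cut into ranges by choosing different starting words, which makes it easier to hand pieces to separate threads.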

Last but not least, please rethink the purpose of incrementing and decrementing argc and argv. Frankly, I do not quite understand your intention there, and it looks dangerous: argv + 1 is applied on every recursive call, so the pointer quickly moves past the end of the argument array.
