
Combinatorics algorithm for extracting subsets of lists for a password cracker

I must be rusty because I can't come up with a solution.

Say we have 3 lists of words:

list1   list2   list3
-----   -----   -----
pizza   red     child
pasta   green   man
apple   blue    adult
pear    yellow  old

I need to select subsets from each list such that:

  • The union of all the selected sections yields every possible combination from the whole list (e.g. pizza-red-child or pizza-red-man)
  • There are no duplicates: if one selected section contains a combination, I don't want any other section to include it
  • The selected sections need to have a certain minimum size (defined as element count 1 * count 2 * etc.)
  • I need to have a minimum number of selected sections

Now the trivial solution is of course this: say you had to split this list in 4 for 4 workers (each worker's share is what I call a selected section above), just send every combination starting with pizza to worker 1, pasta to worker 2, and so on. But that doesn't work if you have more workers than elements in your longest list, and things get complicated.

Edit - Example

So the goal is, given the lists, to find all combinations. But you need to split the main job across several machines.

The trivial solution explained above is: you have 4 elements in the longest list, so just use 4 machines. In this case, it would look like this:

Machine 1:

list1   list2   list3
-----   -----   -----
pizza   red     child
        green   man
        blue    adult
        yellow  old

Machine 2:

list1   list2   list3
-----   -----   -----
        red     child
pasta   green   man
        blue    adult
        yellow  old

Machine 3:

list1   list2   list3
-----   -----   -----
        red     child
        green   man
apple   blue    adult
        yellow  old

Machine 4:

list1   list2   list3
-----   -----   -----
        red     child
        green   man
        blue    adult
pear    yellow  old

However, this doesn't work if you have to split the work over more machines than the number of elements in the longest list. In that case, say you need to split the work over 8 machines (or 4 machines with two rounds per machine), it would have to look like this (I used 8 as it makes the example simpler, but the actual number is not that nice).

Machine 1:

list1   list2   list3
-----   -----   -----
pizza   red     child
        green   man
                adult
                old

Machine 2:

list1   list2   list3
-----   -----   -----
        red     child
pasta   green   man
                adult
                old

Machine 3:

list1   list2   list3
-----   -----   -----
        red     child
        green   man
apple           adult
                old

Machine 4:

list1   list2   list3
-----   -----   -----
        red     child
        green   man
                adult
pear            old

Machine 5:

list1   list2   list3
-----   -----   -----
pizza           child
                man
        blue    adult
        yellow  old

Machine 6:

list1   list2   list3
-----   -----   -----
                child
pasta           man
        blue    adult
        yellow  old

Machine 7:

list1   list2   list3
-----   -----   -----
                child
                man
apple   blue    adult
        yellow  old

Machine 8:

list1   list2   list3
-----   -----   -----
                child
                man
        blue    adult
pear    yellow  old

As you can see, that's a way to split the original lists, whose longest list has 4 elements, across 8 machines. The question is: how do you do that programmatically when you can't control the number of machines or the number of elements in the lists?
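
One way to read the 8-machine layout above: each machine receives a Cartesian product of sub-ranges, here list1 split into 4 pieces, list2 into 2, and list3 left whole. The following is only a minimal C++ sketch of that reading, not a complete answer. `Range`, `Block`, `make_blocks` and `block_size` are hypothetical names introduced for illustration, and how to pick the per-list chunk counts for a given number of machines and minimum section size is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open index range [first, second) into one word list.
using Range = std::pair<std::size_t, std::size_t>;
// One work unit: a sub-range per list. Its combinations are the Cartesian
// product of those sub-ranges.
using Block = std::vector<Range>;

// Hypothetical helper: split list i (of length sizes[i]) into chunks[i]
// nearly equal pieces and return every combination of pieces.
std::vector<Block> make_blocks(const std::vector<std::size_t>& sizes,
                               const std::vector<std::size_t>& chunks) {
    assert(sizes.size() == chunks.size());
    std::vector<Block> blocks(1);  // start with a single empty block
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        assert(chunks[i] >= 1 && chunks[i] <= sizes[i]);
        std::vector<Block> next;
        for (const Block& b : blocks) {
            for (std::size_t c = 0; c < chunks[i]; ++c) {
                Block extended = b;
                // Piece c of list i covers
                // [sizes[i] * c / chunks[i], sizes[i] * (c + 1) / chunks[i]).
                extended.push_back({sizes[i] * c / chunks[i],
                                    sizes[i] * (c + 1) / chunks[i]});
                next.push_back(extended);
            }
        }
        blocks = std::move(next);
    }
    return blocks;
}

// Number of combinations inside one block.
std::size_t block_size(const Block& b) {
    std::size_t n = 1;
    for (const Range& r : b) n *= r.second - r.first;
    return n;
}
```

Under these assumptions, `make_blocks({4, 4, 4}, {4, 2, 1})` yields 8 blocks of 8 combinations each, covering all 64 combinations without overlap. The blocks come out in a different order than Machines 1 to 8 above, but they are the same set.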

If there were 1 worker, it would go in this order:

pizza red child
pizza red man
pizza red adult
pizza red old
pizza green child
pizza green man
pizza green adult
pizza green old
pizza blue child
pizza blue man
pizza blue adult
pizza blue old
pizza yellow child
pizza yellow man
pizza yellow adult
pizza yellow old
pasta red child
pasta red man
pasta red adult
pasta red old
pasta green child
pasta green man
pasta green adult
pasta green old
pasta blue child
pasta blue man
pasta blue adult
pasta blue old
pasta yellow child
pasta yellow man
pasta yellow adult
pasta yellow old
apple red child
apple red man
apple red adult
apple red old
apple green child
apple green man
apple green adult
apple green old
apple blue child
apple blue man
apple blue adult
apple blue old
apple yellow child
apple yellow man
apple yellow adult
apple yellow old
pear red child
pear red man
pear red adult
pear red old
pear green child
pear green man
pear green adult
pear green old
pear blue child
pear blue man
pear blue adult
pear blue old
pear yellow child
pear yellow man
pear yellow adult
pear yellow old

If you have more workers, split by range. E.g. worker 1 gets "pizza red child" - "pizza blue child", worker 2 gets "pizza blue man" - "pasta red adult", etc.

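#include <cstddef>
#include <cstdio>
#include <string>
#include <thread>
#include <vector>

std::vector<std::vector<std::string>> lists = {
    {"apple", "pasta", "pear", "pizza"},
    {"red", "green", "blue", "yellow"},
    {"child", "man", "adult", "old"}};
const int K = 7;  // number of workers
long long N = 1;  // total number of combinations, computed in main()

// Returns the per-list indices of the first combination owned by worker k,
// i.e. the combination at linear position N * k / K.
// For k == K the first index equals lists[0].size(), a past-the-end marker.
std::vector<long long> calc_vector(int k) {
    long long remain_all = N;
    long long remain_cur = N * k / K;
    std::vector<long long> ret;
    for (std::size_t i = 0; i < lists.size(); ++i) {
        long long sz = static_cast<long long>(lists[i].size());
        long long i1 = remain_cur * sz / remain_all;
        ret.push_back(i1);
        remain_all /= sz;
        remain_cur -= remain_all * i1;
    }
    return ret;
}

// Prints every combination from worker k's start up to, but not including,
// worker k+1's start.
void work(int k) {
    auto v1 = calc_vector(k);
    auto v2 = calc_vector(k + 1);
    while (v1 != v2) {
        // The format string assumes exactly three lists.
        std::printf("%d: %s-%s-%s\n", k, lists[0][v1[0]].c_str(),
                    lists[1][v1[1]].c_str(), lists[2][v1[2]].c_str());
        // Odometer-style increment: bump the last index and carry leftwards.
        // Index 0 is never wrapped, so it can reach the past-the-end marker.
        for (int i = static_cast<int>(v1.size()) - 1; i >= 0; --i) {
            v1[i]++;
            if (v1[i] != static_cast<long long>(lists[i].size()) || i == 0) break;
            v1[i] = 0;
        }
    }
}

int main() {
    for (auto &list : lists) N *= static_cast<long long>(list.size());
    std::vector<std::thread> threads;
    for (int i = 0; i < K; ++i) threads.emplace_back(work, i);
    // Output lines from different workers may interleave.
    for (auto &thread : threads) thread.join();
    return 0;
}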
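
The same range split can also be phrased in terms of a linear index: number the combinations 0 to N-1 in the order listed above, give worker k the half-open range [N*k/K, N*(k+1)/K), and convert an index back to one position per list. The following is a minimal sketch of that view, not a replacement for the program above. `unrank` and `worker_range` are hypothetical names, and it assumes N*K fits in `std::size_t`, which may not hold for a large password keyspace.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Convert a linear index in [0, N) into one position per list, treating the
// lists as digits of a mixed-radix number (the last list varies fastest).
std::vector<std::size_t> unrank(std::size_t index,
                                const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> pos(sizes.size());
    for (std::size_t i = sizes.size(); i-- > 0;) {
        assert(sizes[i] > 0);
        pos[i] = index % sizes[i];
        index /= sizes[i];
    }
    assert(index == 0);  // index must be smaller than the product of sizes
    return pos;
}

// Half-open range [begin, end) of linear indices owned by worker k of K.
std::pair<std::size_t, std::size_t> worker_range(std::size_t total,
                                                 std::size_t k,
                                                 std::size_t K) {
    assert(K > 0 && k < K);
    return {total * k / K, total * (k + 1) / K};
}
```

With three lists of 4 words and K = 7, `worker_range(64, 0, 7)` gives [0, 9), and `unrank(8, {4, 4, 4})` gives positions {0, 2, 0}, the last combination of the first worker. Consecutive ranges share their boundaries, so every index is covered exactly once.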

If I get it right, maybe you can try replacing the elements that you pick in your selected sections. For example:

pizza - red - child

then:

pasta - red - child

. . .

and so on...

So instead of creating a new selected section for every possible combination, you can try to keep one selected section and modify it in place to get the next combination once you are done with the current one.
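
If the suggestion is read as "keep one selection and step it forward in place", it could look roughly like the sketch below. This is an interpretation, not necessarily what the answer intends. `advance` is a hypothetical name, and the first list is varied fastest to match the pizza to pasta step in the example.

```cpp
#include <cstddef>
#include <vector>

// Step `pos` to the next combination in place, like an odometer whose first
// wheel turns fastest: bump position 0 and carry to the right on overflow.
// Returns false once all combinations were visited (pos is back to all zeros).
bool advance(std::vector<std::size_t>& pos,
             const std::vector<std::size_t>& sizes) {
    for (std::size_t i = 0; i < pos.size(); ++i) {
        if (++pos[i] < sizes[i]) return true;  // no carry needed
        pos[i] = 0;                            // wrap and carry to the next list
    }
    return false;
}
```

Starting from positions {0, 0, 0} with three lists of 4 words, repeated calls visit all 64 combinations once before `advance` returns false. Only a single vector of positions is kept in memory.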
