Algorithm to find a target sum of the sum of products of two lists of ordered integers

I am at work trying to write a script that parses poorly converted tables in text form, originating from parsed PDFs, into CSVs. Essentially, the headers are plank lengths, the data are the numbers of planks, and at the end of each row the total length of all the planks in that row is given.

Simplified example

1,0   2,0   3,0   4,0 5,0   total M
1      3    2     1         17,0

Since the layout varies wildly and I don't need to guarantee correctness, I think there's a decent chance that simply trying all valid combinations of plank counts times lengths, adding them together and seeing which ones sum to the total, will work well enough.

As a proof of concept I want to write a simple program that takes two lists of integers and looks for all valid sums of products, to check that I don't get a combinatorial nightmare.

The rules for this toy problem are as follows.

Two lists of integers: the first is [1..14]; the second holds smallish integers (< 1000) and has 1 to 14 members. Call them lengths and numPlanks.

A target sum, which is found by summing the products of each member of numPlanks with exactly one member of lengths; no two members of numPlanks can share a length. The program searches through all such combinations and prints the ones that match the target.

Further, the members of both lists are ordered. If the first element of numPlanks is multiplied with the second element of lengths, no other member of numPlanks can be multiplied with the first element of lengths.

Example, in pseudo-code

lengths = [1, 2, 3, 4]
numPlanks = [10, 20]
target = 110

The program would then check 10 + 40, 10 + 60, 10 + 80, 20 + 60, 20 + 80 and 30 + 80 to see which add up to the target, and finally print out something like "10*3 + 20*4 = 110".

I've been trying to construct solutions, but I'm stumped: the only thing I can think of is nesting as many loops as there are members in numPlanks, which seems terrible.
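One way to avoid a fixed number of nested loops is recursion: each call places one plank count and only tries lengths to the right of the previous choice. The following is a minimal sketch under the rules stated above, not a tuned implementation; the class and method names (OrderedPlankSearch, search, format) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedPlankSearch {

    /** Returns every increasing choice of indexes into lengths whose sum of products equals target. */
    static List<int[]> search(int[] lengths, int[] numPlanks, int target) {
        List<int[]> hits = new ArrayList<>();
        recurse(lengths, numPlanks, target, 0, 0, new int[numPlanks.length], hits);
        return hits;
    }

    private static void recurse(int[] lengths, int[] numPlanks, int remaining,
            int plank, int firstLength, int[] chosen, List<int[]> hits) {
        if (plank == numPlanks.length) {
            // Every plank count has a length; keep the combination if it hits the target.
            if (remaining == 0) {
                hits.add(chosen.clone());
            }
            return;
        }
        // Leave enough lengths to the right for the plank counts still to be placed.
        int lastLength = lengths.length - (numPlanks.length - plank);
        for (int i = firstLength; i <= lastLength; i++) {
            chosen[plank] = i;
            recurse(lengths, numPlanks, remaining - numPlanks[plank] * lengths[i],
                    plank + 1, i + 1, chosen, hits);
        }
    }

    /** Formats one combination as "10*3 + 20*4". */
    static String format(int[] lengths, int[] numPlanks, int[] chosen) {
        StringBuilder sb = new StringBuilder();
        for (int p = 0; p < chosen.length; p++) {
            if (p > 0) {
                sb.append(" + ");
            }
            sb.append(numPlanks[p]).append('*').append(lengths[chosen[p]]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] lengths = {1, 2, 3, 4};
        int[] numPlanks = {10, 20};
        int target = 110;
        for (int[] chosen : search(lengths, numPlanks, target)) {
            System.out.println(format(lengths, numPlanks, chosen) + " = " + target);
        }
    }
}
```

With the example input this visits the same six pairings listed above and prints 10*3 + 20*4 = 110.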

The program is written in Java, so if anyone wants to point out anything language-specific I'd be quite grateful, and anything else is of course fantastic as well.

Edit: sketching with pen and paper, it seems the number of comparisons is related to Pascal's triangle. E.g., for lengths with two members and numPlanks with 0 to 2 members, the number of comparisons is 1, 2, 1 for 0, 1 and 2 members in numPlanks respectively.

Given that I know I have exactly 14 members in lengths in my actual problem and 1 to 14 members in numPlanks, this would correspond to a worst case of 1716 comparisons, which seems pretty OK.
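The Pascal's triangle observation can be checked with a few lines of code. This sketch assumes the number of candidate sums for k plank counts over n lengths is the binomial coefficient C(n, k), which is what the 1, 2, 1 pattern suggests; ComparisonCount and choose are illustrative names. Under that assumption, row 14 peaks at C(14, 7) = 3432 (1716 is the peak of row 13), so the worst case is still only a few thousand sums.

```java
class ComparisonCount {

    /** Binomial coefficient C(n, k); every intermediate value is itself a binomial coefficient. */
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After this step, result == C(n - k + i, i), so the division is exact.
            result = result * (n - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        int n = 14; // number of lengths in the real problem
        for (int k = 0; k <= n; k++) {
            System.out.println(k + " plank counts: " + choose(n, k) + " candidate sums");
        }
    }
}
```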

Short answer: your estimate of the number of calculations is hopelessly optimistic, unless I'm misunderstanding your problem.

Suppose you have 14 elements in lengths and 14 elements in numPlanks. Since each length and each numPlanks value can only be used once (if I understand correctly), you basically have 14*14 = 196 possible terms, and you need to find some combination of them that adds up to your target.

Suppose you start with a guess that the solution includes a particular length and a particular numPlanks value. That means you can cross off 13 other terms having the same numPlanks as your guess and 13 other terms having the same length as your guess. And, of course, you can cross off the term you guessed. That still leaves you with 169 terms to try to add to that guess.

So, you pick one. Now you cross off 12+13 more terms, like before, because they share a value with your guess at the 2nd term. Now you've got 144 terms left, and so on.

So, just to get all possible guesses of 3 terms, you have to look at 196*169*144 = about 4.8 million possibilities.
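The arithmetic behind that figure can be reproduced as follows. This is only a sketch of the counting argument in this answer, which treats guesses as ordered and does not apply the question's ordering rule; GuessCount and orderedGuesses are illustrative names.

```java
class GuessCount {

    /**
     * Number of ordered guesses of the given number of terms when every guess removes
     * one length and one plank count from further use: size^2 * (size-1)^2 * ...
     */
    static long orderedGuesses(int size, int terms) {
        long total = 1;
        for (int picked = 0; picked < terms; picked++) {
            long remaining = size - picked;
            // multiplyExact throws ArithmeticException instead of silently overflowing.
            total = Math.multiplyExact(total, remaining * remaining);
        }
        return total;
    }

    public static void main(String[] args) {
        // 196 * 169 * 144
        System.out.println(orderedGuesses(14, 3)); // prints 4769856
    }
}
```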

Here's some Java code that generates solutions. This version has 14 elements in each array. It finds a solution after 64000 comparisons (already far higher than your worst-case estimate). If you give it something unsolvable (e.g., make all the numbers divisible by 10 and give it a target of 2051), then go get a cup of coffee and wait for the universe to end...

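import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Tester {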

    static int numComparisons = 0;

    public static class Term {
        final int length;
        final int numPlanks;

        public Term(final int length, final int numPlanks) {
            this.length = length;
            this.numPlanks = numPlanks;
        }

        @Override
        public String toString() {
            return "(" + this.length + "*" + this.numPlanks + ")";
        }
    }

    public static List<Term> getTerms(int target, List<Integer> lengthsList,
            List<Integer> numPlanksList, List<Term> currentTermList) {
        // System.out.println("... " + target + ", " + lengthsList + ", " +
        // numPlanksList + ", " + currentTermList);
        lengthsLoop: for (int l : lengthsList) {
            numPlanksLoop: for (int n : numPlanksList) {
                numComparisons++;
                if (numComparisons % 100 == 0) {
                    System.out.println("... numComparisons = " + numComparisons
                            + " .... " + currentTermList);
                }
                if ((l * n) > target) {
                    continue lengthsLoop;
                } else if (l * n == target) {
                    final List<Term> newCurrentTermList = new ArrayList<Term>(
                            currentTermList);
                    newCurrentTermList.add(new Term(l, n));
                    return newCurrentTermList;
                } else {
                    final int newTarget = target - (l * n);
                    final List<Integer> newLengthsList = new ArrayList<Integer>(
                            lengthsList);
                    newLengthsList.remove((Integer) l);
                    final List<Integer> newNumPlanksList = new ArrayList<Integer>(
                            numPlanksList);
                    newNumPlanksList.remove((Integer) n);
                    final List<Term> newCurrentTermList = new ArrayList<Term>(
                            currentTermList);
                    newCurrentTermList.add(new Term(l, n));
                    final List<Term> answer = getTerms(newTarget,
                            newLengthsList, newNumPlanksList,
                            newCurrentTermList);
                    if (answer.size() > 0) {
                        return answer;
                    }
                }
            }
        }

        // System.out.println("Over!");
        return new ArrayList<Term>();

    }

    public static void main(String[] args) {

        List<Integer> lengthsList = new ArrayList<Integer>(Arrays.asList(1, 2,
                3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14));
        Collections.sort(lengthsList, Collections.reverseOrder());
        List<Integer> numPlanksList = new ArrayList<Integer>(Arrays.asList(1,
                20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140));
        Collections.sort(numPlanksList, Collections.reverseOrder());
        int target = 2051;

        final List<Term> finalAnswer = getTerms(target, lengthsList,
                numPlanksList, new ArrayList<Term>());
        if (finalAnswer.size() > 0) {
            System.out.println("Final Answer:");
            System.out.println("=============");
            for (Term t : finalAnswer) {
                System.out.println(t.length + "*" + t.numPlanks);
            }
        } else {
            System.out.println("No solution");
        }
        System.out.println("numComparisons = " + numComparisons);

    }
}

Because the integer arrays are ordered, this should be a quick solution.

Testing with

    int[] lengths = { 1, 2, 3, 4 };
    int[] plankCount = { 10, 20 };
    int totalPlankLength = 110;

I got the following result:

    [10 x 3, 20 x 4]

Testing with

    int[] lengths = { 1, 2, 3, 4, 5, 6, 7 };
    int[] plankCount = { 10, 20, 30 };
    int totalPlankLength = 280;

I got the following results:

    [10 x 1, 20 x 3, 30 x 7]
    [10 x 2, 20 x 4, 30 x 6]

Thanks to greybeard for the comment. In a made-up example, it's likely that more than one answer fits. With the real data, it's less likely, but still possible.

I use binary numbers to help me generate the set of possible ways to pair the plank counts with the lengths. There's nothing magic about binary, except that it solves the problem.

Let me illustrate with the simpler example. We have the following input:

    int[] lengths = { 1, 2, 3, 4 };
    int[] plankCount = { 10, 20 };
    int totalPlankLength = 110;

So, we need a way to get all the possible ways to multiply a plank count with a length.

First, I calculate the number of possibilities by raising 2 to the power of the length of the lengths array. In this example, we calculate 2 to the 4th power, or 16.

Since we're using an int, the maximum length of the lengths array is 30. If you want a longer lengths array, you would have to convert the ints to longs.
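A hedged sketch of what the long-based variant of the loop bound might look like: the shift 1L << n replaces the int power of two and Long.bitCount replaces Integer.bitCount. WideMask and countPatterns are illustrative names, and scanning all 2^n masks becomes impractically slow long before n reaches the limit of a long.

```java
class WideMask {

    /** Counts the masks below 2^lengthCount that have exactly plankCount one-bits. */
    static long countPatterns(int lengthCount, int plankCount) {
        if (lengthCount < 0 || lengthCount > 62) {
            throw new IllegalArgumentException("lengthCount must be between 0 and 62");
        }
        long maximum = 1L << lengthCount; // long shift, so no int overflow past 30 lengths
        long count = 0;
        for (long mask = maximum - 1; mask >= 0; mask--) {
            if (Long.bitCount(mask) == plankCount) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPatterns(14, 7)); // prints 3432
    }
}
```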

We don't need to look at all of the binary numbers between 15 and 0. We just need to look at the binary numbers that have exactly as many one-bits as there are plank counts (plankCount.length). We look at the binary numbers in reverse order.

12 1100
10 1010
 9 1001
 6 0110
 5 0101
 3 0011

The decimal numbers on the left don't matter. The bit patterns on the right are what matter. The set of bit patterns shows all the ways you can pair the plank counts with the lengths.
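As a small illustration, the bit patterns can be generated by counting down and keeping only the masks with the right number of one-bits. This is a sketch assuming between 1 and 30 lengths; BitPatterns and patterns are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class BitPatterns {

    /** Lists, in descending order, the zero-padded bit patterns with exactly plankCount one-bits. */
    static List<String> patterns(int lengthCount, int plankCount) {
        List<String> result = new ArrayList<>();
        for (int mask = (1 << lengthCount) - 1; mask >= 0; mask--) {
            if (Integer.bitCount(mask) == plankCount) {
                String bits = Integer.toBinaryString(mask);
                StringBuilder padded = new StringBuilder();
                // Left-pad with zeros so every pattern has one character per length.
                for (int i = bits.length(); i < lengthCount; i++) {
                    padded.append('0');
                }
                result.add(padded.append(bits).toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Four lengths, two plank counts: prints 1100, 1010, 1001, 0110, 0101, 0011.
        for (String pattern : patterns(4, 2)) {
            System.out.println(pattern);
        }
    }
}
```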

So, for each of the 6 patterns we multiply both plank counts by a length, for a total of 12 multiplications. This happens quickly.

I do the multiplications and sum the products for each binary pattern to see if the sum is equal to the total plank length. If so, I write out those multiplications.
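The per-pattern sum can also be computed straight from the integer mask, without building a bit string first. This is an alternative sketch, not the code of this answer; PatternSum and sumForPattern are illustrative names, and the leftmost bit is taken to correspond to the first length, as in the table above.

```java
class PatternSum {

    /**
     * Sums plankCount[j] * lengths[i] over the one-bits of mask, reading from the leftmost bit.
     * Assumes mask has exactly plankCount.length one-bits within the low lengths.length bits.
     */
    static int sumForPattern(int[] lengths, int[] plankCount, int mask) {
        int sum = 0;
        int plankIndex = 0;
        for (int i = 0; i < lengths.length; i++) {
            int bit = lengths.length - 1 - i; // the leftmost bit pairs with lengths[0]
            if ((mask & (1 << bit)) != 0) {
                sum += lengths[i] * plankCount[plankIndex++];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] lengths = {1, 2, 3, 4};
        int[] plankCount = {10, 20};
        // Pattern 0011 pairs 10 with length 3 and 20 with length 4.
        System.out.println(sumForPattern(lengths, plankCount, 0b0011)); // prints 110
    }
}
```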

Here's the corrected code. Try it with 14 lengths, and see if it's fast enough for your needs.

package com.ggl.testing;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoardLength {

    public static void main(String[] args) {
        BoardLength boardLength = new BoardLength();
        int[] lengths = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        int[] plankCount = { 10, 20, 30 };
        int totalPlankLength = 360;
        List<List<PlankLength>> plankLength = boardLength.calculatePlankLength(
                lengths, plankCount, totalPlankLength);
        displayResults(plankLength);
    }

    private static void displayResults(List<List<PlankLength>> plankLength) {
        if (plankLength.size() <= 0) {
            System.out.println("[]");
        } else {
            for (List<PlankLength> list : plankLength) {
                System.out.println(Arrays.toString(list.toArray()));
            }
        }
    }

    public List<List<PlankLength>> calculatePlankLength(int[] lengths,
            int[] plankCount, int totalPlankLength) {
        List<List<PlankLength>> plankLength = new ArrayList<>();
        String formatString = "%" + lengths.length + "s";

        int maximum = (int) Math.round(Math.pow(2D, (double) lengths.length));
        for (int index = maximum - 1; index >= 0; index--) {
            int bitCount = Integer.bitCount(index);
            if (bitCount == plankCount.length) {
                String bitString = String.format(formatString,
                        Integer.toBinaryString(index)).replace(' ', '0');
                calculateTotalPlankLength(lengths, plankCount,
                        totalPlankLength, plankLength, bitString);
            }
        }

        return plankLength;
    }

    private void calculateTotalPlankLength(int[] lengths, int[] plankCount,
            int totalPlankLength, List<List<PlankLength>> plankLength,
            String bitString) {
        List<PlankLength> tempList = new ArrayList<>();
        int plankIndex = 0;
        int sum = 0;
        for (int bitIndex = 0; bitIndex < bitString.length(); bitIndex++) {
            if (bitString.charAt(bitIndex) == '1') {
                PlankLength pl = new PlankLength(lengths[bitIndex],
                        plankCount[plankIndex++]);
                sum += pl.getPlankLength();
                tempList.add(pl);
            }
        }

        if (sum == totalPlankLength) {
            plankLength.add(tempList);
        }
    }

    public class PlankLength {
        private final int length;
        private final int plankCount;

        public PlankLength(int length, int plankCount) {
            this.length = length;
            this.plankCount = plankCount;
        }

        public int getLength() {
            return length;
        }

        public int getPlankCount() {
            return plankCount;
        }

        public int getPlankLength() {
            return length * plankCount;
        }

        @Override
        public String toString() {
            return "" + plankCount + " x " + length;
        }

    }

}
