Cannot locate my mistake in an algorithm involving Pythagorean triples and sets

Question

I'm trying to find the answer to "Given that L is the length of the wire, for how many values of L ≤ 1,500,000 can exactly one integer sided right angle triangle be formed?" (Project Euler #75).

I'm not asking for the correct answer, or the code to find it. I will explain how I attempted to solve it and just ask you to point out where I'm wrong. I tried to solve the question in both Java and Common Lisp, always getting the same wrong answer. Therefore I'm pretty sure something in my algorithm or my basic assumptions is wrong, but I cannot locate it. My guesses for where the mistake might be are 1) the set difference and 2) the limits I set for the parameters.

Here is the algorithm I followed:

Generate all Pythagorean triple perimeters and collect them in a set "A" (list for Common Lisp, hence the time difference). This set will have every single perimeter generated, be it with only one triangle solution or more than one. As it is a set every perimeter will be represented just once.
Collect the recurring (duplicate) perimeters in an additional set "B". This set will include only the perimeters which have more than one triangle solution.
A - B should give me the list of perimeters with single triangle solution. This could be where I'm wrong.
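
The three steps above can be sketched on a tiny hand-made input. This is a minimal illustration, not the full solver: the class name ExactlyOnceDemo, the method exactlyOnce and the sample values are made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

class ExactlyOnceDemo {
    // Returns the values that occur exactly once in the input.
    static Set<Integer> exactlyOnce(int[] values) {
        Set<Integer> seen = new HashSet<>();      // plays the role of set "A"
        Set<Integer> repeated = new HashSet<>();  // plays the role of set "B"
        for (int v : values) {
            // add() returns false when v was already in the set
            if (!seen.add(v)) {
                repeated.add(v);
            }
        }
        Set<Integer> result = new HashSet<>(seen);
        result.removeAll(repeated);               // A - B
        return result;
    }

    public static void main(String[] args) {
        int[] sample = {12, 24, 30, 24, 60, 60, 60};
        // 24 and 60 repeat, so only 12 and 30 remain
        System.out.println(exactlyOnce(sample));
    }
}
```

Note that A - B counts "perimeters with exactly one triangle" only if every distinct triangle contributes its perimeter to the input exactly once.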

I used the formula I found here to generate the triples, thus the perimeters. I preferred to use the formula with the additional coefficient "k" because the article said,

Despite generating all primitive triples, Euclid's formula does not produce all triples—for example, (9, 12, 15) cannot be generated using integer m and n. This can be remedied by inserting an additional parameter k to the formula.
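
As a small check of the quoted statement, the sketch below builds a triple from (m, n, k) and brute-forces whether a given triple can be reached with k = 1. The class and method names (EuclidDemo, triple, reachableWithoutK) are hypothetical and exist only for this illustration.

```java
class EuclidDemo {
    // a = k(m^2 - n^2), b = 2kmn, c = k(m^2 + n^2)
    static long[] triple(long m, long n, long k) {
        return new long[] { k * (m * m - n * n), 2 * k * m * n, k * (m * m + n * n) };
    }

    // Brute-force search: can (a, b, c) be produced with k = 1 and integers m > n >= 1?
    static boolean reachableWithoutK(long a, long b, long c) {
        for (long m = 2; m * m < c; m++) {
            for (long n = 1; n < m; n++) {
                long x = m * m - n * n;
                long y = 2 * m * n;
                long z = m * m + n * n;
                if (z == c && ((x == a && y == b) || (x == b && y == a))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] t = triple(2, 1, 3);
        System.out.println(t[0] + ", " + t[1] + ", " + t[2]);  // 9, 12, 15
        System.out.println(reachableWithoutK(9, 12, 15));      // false
        System.out.println(reachableWithoutK(3, 4, 5));        // true
    }
}
```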

In order to solve the problem in reasonable time, I needed sensible limits for the parameters in the nested loops. I set the limits for "k" and "m", which you will see in the full code presented later, with the help of these two little functions:

(defun m-limit (m)
  (if (> (make-peri m 1 1) 1500000)
      m
      (m-limit (1+ m))))

(defun k-limit (k)
  (if (> (make-peri 2 1 k) 1500000)
      k
      (k-limit (1+ k))))

To set the limit for "n", I added "a", "b" and "c" from the formula and solved the resulting inequality for "n" (this could be another point where I made a mistake).

a = k * (m * m - n * n);

b = 2 * k * m * n;

c = k * (m * m + n * n);

k * (m^2 - n^2 + 2mn + m^2 + n^2) <= 1500000

And found this:

nLimit = 1500000 / (2 * k * m) - m;
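
For reference, the algebra behind that bound can be written out step by step. This only restates the formulas above, with L standing for the limit of 1,500,000:

```latex
\begin{aligned}
a + b + c &= k(m^2 - n^2) + 2kmn + k(m^2 + n^2) \\
          &= 2km^2 + 2kmn \;=\; 2km(m + n) \;\le\; L \\
\Longrightarrow\quad n &\le \frac{L}{2km} - m
\end{aligned}
```

Because n and m are integers, rounding L / (2km) down (as integer division in Java does) leaves the set of admissible n unchanged.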

Here is the code in Java and Common Lisp. Beware: while the Java version takes only 2 seconds thanks to HashSet, the Common Lisp version takes 1889 seconds on my laptop, most probably because of checking whether each newly generated perimeter is already a member of the set "A".

Java code:

package euler75v6;

import java.util.HashSet;

/**
 *
 * @author hoyortsetseg
 */
public class Euler75v6 {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        HashSet<Long> peris = new HashSet<>();
        HashSet<Long> duplicatePeris = new HashSet<>();
        peris = periList(865,125000, duplicatePeris);
        System.out.println("Number of all perimeters: " + peris.size());
        System.out.println("Number of duplicate perimeters: " + duplicatePeris.size());
        System.out.println("Number of all perimeters minus number "
                + "of duplicate perimeters: " + (peris.size() - duplicatePeris.size()));
        peris.removeAll(duplicatePeris);
        System.out.println("Same difference, just to confirm. After 'removeAll': " + peris.size());
    }
    private static Long makePeri (long m, long n, long k){
        //Long a, b, c, res;
        //a = k * (m * m - n * n);
        //b = 2 * k * m * n;
        //c = k * (m * m + n * n);
        return 2 * k * (m * m  + m * n);
    }

    private static HashSet<Long> periList (long x, long z, HashSet<Long> dupList){
            HashSet<Long> res = new HashSet<>();
            Long temp, nLimit;
            Long limit = Long.valueOf("1500000");
            for (long k = 1; k <= z; k++){
                for (long m = 2; m <= x; m++){
                    nLimit = 1500000 / (2 * k * m) - m;
                    for (long n = 1; ((n <= nLimit) && (n < m)); n++){
                        temp = makePeri(m,n,k);
                        if (Long.compare(temp, limit) <= 0){ // Should be redundant but just in case.
                            if (res.contains(temp)){
                                dupList.add(temp);
                            }
                            else {
                                res.add(temp);
                            }
                        }
                    }
                }
            }
            return res;
        }    
}

Common Lisp code:

(defun make-peri (m n k)
  (* 2 k (+ (* m m) (* m n))))

(defun peri-list (m n k all-peris dupl-peris)
  (let ((n-limit (- (/ 1500000 (* 2 k m)) m)))
    (cond ((> k 125000) (- (length all-peris)
               (length (remove-duplicates dupl-peris))))
      ((> m 865) (peri-list 2 1 (1+ k) all-peris dupl-peris))
      ((or (>= n m) (> n n-limit))
       (peri-list (1+ m) 1 k all-peris dupl-peris))
      (t (let ((peri (make-peri m n k)))
           (if (> peri 1500000) ;; Redundant with m n k limits but still.
           (peri-list (1+ m) 1 k all-peris dupl-peris)
           (if (member peri all-peris)
               (peri-list m (1+ n) k all-peris (cons peri dupl-peris))
               (peri-list m (1+ n) k (cons peri all-peris)
                  dupl-peris))))))))

(defun result () (peri-list 2 1 1 nil nil))

Any explanation as to where I'm mistaken will be appreciated. But please do not give the correct answer or the code for it.

Edit:

I made a slightly altered version of the Common Lisp code in order to see how the collected lists (sets) look and hopefully see what goes wrong.

I also added bits to automatically set limits for the m, n and k variables. I also got rid of the redundant "if" that checks whether the perimeter is above the limit, as I saw that having m, n and k within their limits ensures that the perimeter does not exceed its own limit. Here's how the code looks after these modifications:

(defun make-peri (m n k)
  (* 2 k (+ (* m m) (* m n))))

(defun peri-list* (m n k limit all-peris dupl-peris)
  (let* ((n-limit (- (/ limit (* 2 k m)) m))
    (k-upper-limit (1- (k-limit 1 limit)))
    (m-upper-limit (1- (m-limit 2 limit)))
    (dupl-peris* (remove-duplicates dupl-peris))
    (difference* (set-difference all-peris dupl-peris*)))
    (cond ((> k k-upper-limit) (list (sort all-peris #'<)
                     (sort dupl-peris* #'<)
                     (sort difference* #'<)))
                    ;; (length all-peris)
                    ;; (length dupl-peris*)
                    ;; (length difference*)))
      ((> m m-upper-limit) (peri-list* 2 1 (1+ k) limit all-peris dupl-peris))
      ((or (>= n m) (> n n-limit))
       (peri-list* (1+ m) 1 k limit all-peris dupl-peris))
      (t (let ((peri (make-peri m n k)))
           (if (member peri all-peris)
           (peri-list* m (1+ n) k limit all-peris (cons peri dupl-peris))
           (peri-list* m (1+ n) k limit (cons peri all-peris) dupl-peris))))))))

(defun m-limit (m limit)
  (if (> (make-peri m 1 1) limit)
      m
      (m-limit (1+ m) limit)))

(defun k-limit (k limit)
  (if (> (make-peri 2 1 k) limit)
      k
      (k-limit (1+ k) limit)))

First I tried it with a small limit to see how it behaved. I had not commented out the (length...) part at first. I saw some behaviour I didn't understand:

CL-USER> (peri-list* 2 1 1 150 nil nil)
((12 24 30 36 40 48 56 60 70 72 80 84 90 96 108 112 120 126 132 140 144 150)
 (24 48 60 72 80 84 90 96 108 112 120 132 140 144) (12 30 36 40 56 70 126 150)
 13 9 8)
CL-USER> (- 22 14)
8
CL-USER> (peri-list* 2 1 1 100 nil nil)
((12 24 30 36 40 48 56 60 70 72 80 84 90 96) (24 48 60 72 80 84 90 96)
 (12 30 36 40 56 70) 11 5 6)
CL-USER> (- 14 8)
6

In the result, the lengths of all-peris and dupl-peris* did not match what I was counting. However, their difference did match the count.

Afterwards, I commented out the (length...) part, had the program just return the lists, and mapcar'ed #'length over the result:

CL-USER> (mapcar #'length (peri-list* 2 1 1 100 nil nil))
(14 8 6)

This time, the first two lengths did match the actual count. However, the difference was still the same; meaning I was still getting the wrong answer.

CL-USER> (mapcar #'length (peri-list 2 1 1 nil nil))
(355571 247853)
CL-USER> (- 355571 247853)
107718

This made me question my basic assumptions. So here are my questions.

Specific questions:

Is the algorithm I described at the top of this post correct? Would the difference between all-peris and dupl-peris (after (remove-duplicates...)) give me the correct answer?
Is this bit of code correctly collecting all perimeters in all-peris and perimeters with more than one triangle in dupl-peris?

(let ((peri (make-peri m n k)))
  (if (member peri all-peris)
      (peri-list* m (1+ n) k limit all-peris (cons peri dupl-peris))
      (peri-list* m (1+ n) k limit (cons peri all-peris) dupl-peris)))

What causes the mysterious behaviour of length?
As I am getting the same result with both Java and Common Lisp, I think this is not a language-specific mistake I am making. Why wouldn't the size of peris.removeAll(duplicatePeris); give me the number of perimeters with only one triangle? Is there some basic mistake I am making with the algorithm or with the difference of a set and its subset?

I hope this helps "unhold" my question.

Edit #2, update of the Java version:

I tried to write the Java version of my solution using a HashMap, with perimeters as keys and frequencies of perimeters as values. Here it is:

public static void main(String[] args) {
    HashMap<Long, Long> perisMap = new HashMap<>();
    periMap(865, 125000, perisMap);
    System.out.println("Number of all perimeters (1 triangle, many triangles): " + perisMap.size());
    Long uniqueCounter = Long.valueOf("0");
    for (Map.Entry<Long, Long> entry : perisMap.entrySet()){
        Long freq = entry.getValue();
        if (freq == 1){
            uniqueCounter++;
        }
    }
    System.out.println("Number of all perimeters in the map which appear only once: " + uniqueCounter);
}
private static Long makePeri (long m, long n, long k){
    //Long a, b, c, res;
    //a = k * (m * m - n * n);
    //b = 2 * k * m * n;
    //c = k * (m * m + n * n);
    return 2 * k * (m * m  + m * n);
}
private static void periMap (long x, long z, HashMap<Long, Long> myMap){
        Long nLimit;
        Long limit = Long.valueOf("1500000");
        for (long k = 1; k <= z; k++){
            for (long m = 2; m <= x; m++){
                nLimit = limit / (2 * k * m) - m;
                for (long n = 1; ((n <= nLimit) && (n < m)); n++){
                    Long tempKey = makePeri(m,n,k);
                    Long tempVal = myMap.get(tempKey);
                    if (Long.compare(tempKey, limit) <= 0){
                        if (myMap.containsKey(tempKey)){
                            myMap.put(tempKey, tempVal + 1);
                        }
                        else {
                            myMap.put(tempKey, Long.valueOf("1"));
                        }
                    }
                }
            }
        }
    }

This is what I get when I run it:

Number of all perimeters (1 triangle, many triangles): 355571
Number of all perimeters in the map which appear only once: 107718

The result is the same as the old Java version and the Common Lisp version with lists. I am currently trying to write a new Common Lisp version using a hash-table.

Question:
This is the third version with the same result. Clearly, there is something wrong with my logic/algorithm/maths. Any pointers as to where?

Answer 1

To use Euclid's formula for this problem, you must follow all the rules that guarantee each triple will be generated exactly once. Otherwise you'll generate the same triple multiple times, and you'll wrongly discard valid lengths because their counts are inflated.

The rules include using only (m,n) pairs that are co-prime and where m and n are not both odd.

I think if you add checks to avoid invalid pairs according to these rules, your algorithm will be correct. At least it will be much closer.
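
To make the effect concrete, here is a minimal sketch that counts how often each perimeter is generated, with and without the two rules. The names TripleRulesDemo, perimeterCounts and applyRules are made up for this illustration, and the limit of 24 is chosen only to keep the example small.

```java
import java.util.HashMap;
import java.util.Map;

class TripleRulesDemo {
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Maps each perimeter <= limit to the number of (m, n, k) combinations producing it.
    static Map<Integer, Integer> perimeterCounts(int limit, boolean applyRules) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int m = 2; 2 * m * (m + 1) <= limit; m++) {
            for (int n = 1; n < m; n++) {
                // Rules: m and n co-prime and not both odd (m - n odd, given gcd 1).
                boolean valid = ((m - n) % 2 == 1) && gcd(m, n) == 1;
                if (applyRules && !valid) {
                    continue;
                }
                int base = 2 * m * (m + n);  // perimeter for k = 1
                for (int p = base; p <= limit; p += base) {  // k = 1, 2, 3, ...
                    counts.merge(p, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Without the rules, (6, 8, 10) is produced twice:
        // once as (m=2, n=1, k=2) and once as (m=3, n=1, k=1).
        System.out.println(perimeterCounts(24, false).get(24));  // 2
        System.out.println(perimeterCounts(24, true).get(24));   // 1
    }
}
```

In the unrestricted run, perimeter 24 looks like it has two triangles even though (6, 8, 10) is the only one, so it would be wrongly discarded.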

An additional comment on your Java code: it's seldom useful to declare a Long variable. Declare long and let autoboxing take care of conversion as needed. In general, your use of the Long type is unusual. E.g., Long.valueOf("1") can be replaced by 1 or 1L, and Long.compare(tempKey, limit) <= 0 should be tempKey <= limit.

In fact, long is unnecessary for this problem. It can be done entirely with int.
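
As a small illustration of that style point, the sketch below puts the boxed comparison next to the primitive one. The class and method names are hypothetical; both methods are meant to return the same result for any pair of values that fit in an int.

```java
class LongStyleDemo {
    // Boxed style, similar to the code in the question.
    static boolean withinLimitBoxed(Long value, Long limit) {
        return Long.compare(value, limit) <= 0;
    }

    // Primitive style: same meaning, no boxing, easier to read.
    static boolean withinLimit(int value, int limit) {
        return value <= limit;
    }

    public static void main(String[] args) {
        int limit = 1_500_000;  // comfortably below Integer.MAX_VALUE
        System.out.println(withinLimitBoxed(Long.valueOf("1500000"), Long.valueOf("1500000")));
        System.out.println(withinLimit(1_500_000, limit));
        System.out.println(withinLimit(1_500_001, limit));
    }
}
```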

Lisp

The simplest way to keep track of how many of each length you've generated is with an array of small integers. Here's the idea in Common Lisp:

(defun count-triangles (limit)
  (let ((counts (make-array (1+ limit)
                  :element-type 'unsigned-byte
                  :initial-element 0))
        (result 0))
    (loop for m from 2 to (ceiling (sqrt limit)) do
      (loop for n from 1 to (1- m)
        for k1-len = (* 2 m (+ m n)) then (+ k1-len (* 2 m))
        while (<= k1-len limit)
        when (and (oddp (+ m n)) (= (gcd m n) 1))
        do (loop for len = k1-len then (+ len k1-len)
             while (<= len limit)
             do (case (aref counts len)
                  (0 (incf result)
                     (incf (aref counts len)))
                  (1 (decf result)
                     (incf (aref counts len)))))))
    result))

In compiled CLisp, this takes about 0.5 seconds. The equivalent in Java below runs in 0.014 seconds on my old MacBook.

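// Assumed definitions: the original snippet uses MAX, SQRT_MAX and gcd without showing them.
static final int MAX = 1_500_000;
static final int SQRT_MAX = (int) Math.ceil(Math.sqrt(MAX));

static int gcd(int a, int b) {
  return b == 0 ? a : gcd(b, a % b);
}

static int count() {
  // count[len] is 0, 1 or 2 (meaning "two or more") triangles with perimeter len
  byte [] count = new byte[MAX + 1];
  int result = 0;
  for (int m = 2; m < SQRT_MAX; ++m) {
    for (int n = 1; n < m; ++n) {
      // skip pairs with the same parity or a common factor
      if (((m ^ n) & 1) == 0 || gcd(m, n) > 1) continue;
      int base_len = 2 * m * (n + m);
      if (base_len > MAX) break;
      // multiples of the primitive perimeter correspond to k = 1, 2, 3, ...
      for (int len = base_len ; len <= MAX; len += base_len) {
        switch (count[len]) {
        case 0:
          ++result;
          count[len] = 1;
          break;
        case 1:
          --result;
          count[len] = 2;
          break;
        default:
          break;
        }
      }
    }
  }
  return result;
}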

1 answers

solution1
1 ACCPTED 2017-03-25 02:47:50
