
Clojure - Merge two vectors of vectors different sizes

Here I am again facing some problems with Clojure. I have two vectors of vectors.

[[a b c] [d e f] [g h i]]

and

[[a b] [d e] [g h] [j k]]

I want to merge these two so that the final vector looks like this:

 [[a b c] [d e f] [g h i] [j k l]]

In the last item of the output, [j k l], the l is a constant value used when there is no value to merge (because [j k] has no corresponding item in the first vector). How can I do this?

PS: I am new to Clojure and would appreciate an elaborate answer so that I can understand it better. Also, sorry if this is a trivial question.

In general:

  • break the problem into separable parts
  • give things names
  • compose the parts

So in this case your problem can be broken down into:

  • splitting the lists into the overlapping and non-overlapping parts
  • choosing the best of each of the overlapping parts
  • padding the non-overlapping parts to the correct length
  • combining them back together.

So if I make a couple assumptions about your problem here is an example of breaking it down and building it back up:

user> (def a '[[a b c] [d e f] [g h i]])
#'user/a
user> (def b '[[a b] [d e] [g h] [j k]])
#'user/b

Make a function to choose the correct one of each pair of overlapping parts. I chose the longer one, though you can merge these however you want:

user> (defn longer-list [x y]
        (if (> (count x) (count y))
          x
          y))
#'user/longer-list
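
As a quick check of how longer-list behaves, here is a small sketch using the same definition. The inputs are made up for illustration; the last call shows that on a tie the second argument is returned, which matters if the inner vectors hold different values.

```clojure
;; Same definition as above, repeated so the snippet runs on its own.
(defn longer-list [x y]
  (if (> (count x) (count y))
    x
    y))

(longer-list '[a b c] '[a b])   ;; => [a b c]
(longer-list '[a b] '[a b c])   ;; => [a b c]
(longer-list '[a b] '[x y])     ;; => [x y]  (tie: the second argument wins)
```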

Make a function to pad out a list that's too short:

user> (defn pad-list [l min-len default-value]
        (into l (take (- min-len (count l)) (repeat default-value))))
#'user/pad-list
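
To see what pad-list does at the edges, here is a small sketch with the same definition and some made-up inputs. Note the last call: into adds to the front of a list, so this only pads at the end when given a vector.

```clojure
;; Same definition as above, repeated so the snippet runs on its own.
(defn pad-list [l min-len default-value]
  (into l (take (- min-len (count l)) (repeat default-value))))

(pad-list '[j k] 3 'l)      ;; => [j k l]
(pad-list '[a b c] 3 'l)    ;; => [a b c]    (already long enough)
(pad-list '[a b c d] 3 'l)  ;; => [a b c d]  (take with a negative count yields nothing)
(pad-list '(j k) 3 'l)      ;; => (l j k)    (a list is padded at the front)
```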

Make a function that uses these two functions to split and then recombine the parts of the problem:

user> (defn process-list [a b]
        (let [[longer-input shorter-input] (if (> (count a) (count b))
                                             [a b]
                                             [b a])]
          (concat (map longer-list longer-input shorter-input)
                  (map #(pad-list % 3 'l) (drop (count shorter-input) longer-input)))))
#'user/process-list

and then test it :-)

user> (process-list a b)
([a b c] [d e f] [g h i] [j k l])

There are more details to work out, like what should happen when the inner lists are not subsets of each other. Because both names are bound from a single comparison, inputs of the same length are handled too: b is treated as the longer one and nothing is left to pad. (And yes, you can smash this down to a "one-liner" too.)

I'd take a look at clojure.core.matrix (see here); it has some nice operations which could help you with this.

I would generally go with the following approach:

  1. fill both collections up to the size of the longest one
  2. map over both of them together, filling each pair of inner vectors up to the size of the longest one and then mapping over their items to select the resulting value.

It is better to illustrate it with code:

First of all, let's make up some helper functions:

(defn max-count [coll1 coll2] (max (count coll1) (count coll2)))

Its name speaks for itself.

(defn fill-up-to [coll size] (take size (concat coll (repeat nil))))

This one fills the collection with nils up to a given size:

user> (fill-up-to [1 2 3] 10)
(1 2 3 nil nil nil nil nil nil nil)
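
Two more cases are worth checking, since the merge below relies on them. This is just a sketch with the same definition: a nil collection becomes a run of nils (which is what happens to the missing fourth row of the first vector), and a collection longer than size gets truncated.

```clojure
;; Same definition as above, repeated so the snippet runs on its own.
(defn fill-up-to [coll size]
  (take size (concat coll (repeat nil))))

(fill-up-to nil 3)       ;; => (nil nil nil)
(fill-up-to [1 2 3] 2)   ;; => (1 2)
```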

now the merge function:

(defn merge-colls [v1 v2 default-val]
  (let [coll-len (max-count v1 v2)
        comp-len (max-count (first v1) (first v2))]
    (mapv (fn [comp1 comp2]
            (mapv #(or %1 %2 default-val)
                  (fill-up-to comp1 comp-len)
                  (fill-up-to comp2 comp-len)))
          (fill-up-to v1 coll-len)
          (fill-up-to v2 coll-len))))

The outer mapv operates on the collections made from the initial parameters, filled up to the length of the longest one (coll-len), so in the context of the question it will be:

(mapv some-fn [[a b c] [d e f] [g h i] nil]
              [[a b]   [d e]   [g h]   [j k]])

the inner mapv operates on inner vectors, filled up to the comp-len (3 in this case):

(mapv #(or %1 %2 default-val) '[a b c] '[a b nil])
...
(mapv #(or %1 %2 default-val) '[nil nil nil] '[j k nil])
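
Evaluated on their own, the first and last of these inner steps look like this. It is only a sketch: default-val is bound to 'l here as in the question, and the filled-up inputs are written out by hand.

```clojure
;; Assumption for illustration: the default is the symbol l from the question.
(def default-val 'l)

;; First row: every position already has a value in the first vector.
(mapv #(or %1 %2 default-val) '[a b c] '[a b nil])
;; => [a b c]

;; Last row: the first vector has no row at all, so it was filled with nils.
(mapv #(or %1 %2 default-val) [nil nil nil] '[j k nil])
;; => [j k l]
```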

let's test it:

user> (let [v1 '[[a b c] [d e f] [g h i]]
            v2 '[[a b] [d e] [g h] [j k]]]
        (merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]

ok it works just as we wanted.

Now if you look at merge-colls, you may notice the repetition of this pattern:

(mapv some-fn (fill-up-to coll1 size)
              (fill-up-to coll2 size))

we can eliminate the duplication by moving this pattern out to a function:

(defn mapv-equalizing [map-fn size coll1 coll2]
  (mapv map-fn (fill-up-to coll1 size) (fill-up-to coll2 size)))

and rewrite our merge:

(defn merge-colls [v1 v2 default-val]
  (let [coll-len (max-count v1 v2)
        comp-len (max-count (first v1) (first v2))]
    (mapv-equalizing (fn [comp1 comp2]
                       (mapv-equalizing #(or %1 %2 default-val) 
                                        comp-len comp1 comp2))
                     coll-len v1 v2)))

test:

user> (let [v1 '[[a b c] [d e f] [g h i]]
            v2 '[[a b] [d e] [g h] [j k]]]
        (merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]

ok. now we can shorten it by removing collection size bindings, as we need these values just once:

(defn merge-colls [v1 v2 default-val]
  (mapv-equalizing
   (partial mapv-equalizing
            #(or %1 %2 default-val)
            (max-count (first v1) (first v2)))
   (max-count v1 v2) v1 v2))

in repl:

user> (let [v1 '[[a b c] [d e f] [g h i]]
            v2 '[[a b] [d e] [g h] [j k]]]
        (merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]
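
One thing to keep in mind: because the selection uses or, a stored nil or false is treated the same as a missing value, and the inner width is taken from the first rows only. If either matters for your data, a different technique is to look values up by index with get-in and its not-found argument. The following is only a sketch under those assumptions; merge-by-index and its parameter names are made up for illustration.

```clojure
;; Hypothetical helper, not part of the answer above: index-based merge.
;; A value from `a` wins, then one from `b`, then `default-value`.
(defn merge-by-index [a b default-value]
  (let [rows (max (count a) (count b))
        ;; widest inner vector across both inputs; the leading 0 guards empty input
        cols (apply max 0 (map count (concat a b)))]
    (mapv (fn [i]
            (mapv (fn [j]
                    ;; get-in returns its not-found argument only when the
                    ;; position is really absent, so false survives
                    (get-in a [i j] (get-in b [i j] default-value)))
                  (range cols)))
          (range rows))))

(merge-by-index '[[a b c] [d e f] [g h i]]
                '[[a b] [d e] [g h] [j k]]
                'l)
;; => [[a b c] [d e f] [g h i] [j k l]]

(merge-by-index [[false]] [[true]] :missing)
;; => [[false]]
```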
