
Merging Arrays in Clojure

I need to merge a collection of arrays based on id.

Example data:

EDIT: (changed to match Clojure data structures)

 [{:id 1, :region :NA, :name :Test1, :OS :W}
  {:id 1, :region :EU, :name :Test2, :OS :W}
  {:id 2, :region :AS, :name :test3, :OS :L}
  {:id 2, :region :AS, :name :test4, :OS :M}]

Becomes:

EDIT: (changed to match Clojure data structures)

[{:id 1, :region [:NA :EU], :name [:Test1 :Test2], :OS [:W]}
 {:id 2, :region [:AS], :name [:test3 :test4], :OS [:L :M]}]

| is the delimiter (changeable). If possible, I would also like the merged values in alphabetical order.

You can use some combination of functions from clojure.set (if you change the outermost vector to a set). In particular, clojure.set/index looks promising.
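As a rough sketch of that idea, the snippet below indexes the rows by :id with clojure.set/index and then collapses each group. The names sample-rows, by-id, merge-rows and merged are hypothetical and chosen only for illustration. It assumes every row has the same keys and that the values under each key can be compared with one another.

```clojure
(require '[clojure.set :as set])

(def sample-rows
  [{:id 1, :region :NA, :name :Test1, :OS :W}
   {:id 1, :region :EU, :name :Test2, :OS :W}
   {:id 2, :region :AS, :name :test3, :OS :L}
   {:id 2, :region :AS, :name :test4, :OS :M}])

;; set/index returns a map from each distinct {:id n} to the set of rows
;; that share that id.
(def by-id (set/index (set sample-rows) [:id]))

(defn merge-rows
  "Hypothetical helper: collapse one index entry into a single map whose
  non-index values are sorted vectors of the distinct values seen."
  [[id-map rows]]
  (let [other-keys (keys (apply dissoc (first rows) (keys id-map)))]
    (into id-map
          (for [k other-keys]
            [k (vec (sort (distinct (map #(get % k) rows))))]))))

;; One merged map per id, ordered by id.
(def merged (sort-by :id (map merge-rows by-id)))
```

Because the values are sorted, :region for id 1 comes out as [:EU :NA] rather than in order of appearance.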

(def data
 [{:id 1, :region :NA, :name :Test1, :OS :W}
  {:id 1, :region :EU, :name :Test2, :OS :W}
  {:id 2, :region :AS, :name :test3, :OS :L}
  {:id 2, :region :AS, :name :test4, :OS :M}])

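(defn key-join
  "Joins a list of maps key by key; each value becomes a vector of the distinct values."
  [map-list]
  ;; ks avoids shadowing clojure.core/keys; the keys are taken from the first map
  (let [ks (keys (first map-list))]
    (into {} (for [k ks] [k (vec (set (map #(% k) map-list)))]))))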

(defn group-reduce [key map-list]
  (let [gdata (group-by key map-list)]
    (into [] (for [[k m] gdata]
               (let [m (key-join m)]
                 ;; the grouping key holds a one-element vector; unwrap it
                 (assoc m key ((key m) 0)))))))



user=> (group-reduce :id data)
[{:name [:Test2 :Test1], :OS [:W], :region [:EU :NA], :id 1} {:name [:test3 :test4], :OS [:L :M], :region [:AS], :id 2}]

You can use the merge-with function, as in the example below.

First, we define a helper function:

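(defn collect [& xs]
  ;; With three or more maps per group, merge-with passes the previously merged
  ;; vector as the first argument, so unpack vectors before combining.
  ;; This assumes the original values are not themselves vectors.
  (vec (sort (distinct (mapcat #(if (vector? %) % [%]) xs)))))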

The collect function makes sure that the items in xs are unique and sorted and finally returns them in a vector.

(defn merge-keys [k xs]
  (map #(apply merge-with collect %) (vals (group-by k xs))))

merge-keys first groups the hash-maps in xs by a primary key (in your case :id ), takes each list of grouped items and merges the values of the keys using the collect function from above.
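One detail of merge-with is worth keeping in mind here: it calls the combining function only when a key occurs in more than one map, so a group containing a single map passes through with its values unwrapped. The sketch below is one possible variant that treats every group the same way and leaves the grouping key as a single value. The names wrap-vals and merge-by are hypothetical, and the sketch assumes the values under each key are mutually comparable, since they are held in sorted sets.

```clojure
(defn wrap-vals
  "Hypothetical helper: wrap every value of m in a one-element sorted set."
  [m]
  (into {} (for [[k v] m] [k (sorted-set v)])))

(defn merge-by
  "Sketch: group maps by k, then merge each group so that k stays a single
  value and every other key holds a sorted vector of its distinct values."
  [k maps]
  (for [[id group] (sort-by key (group-by k maps))]
    ;; Merging sorted sets with into keeps values distinct and ordered,
    ;; however many maps the group contains.
    (let [sets (apply merge-with into (map #(wrap-vals (dissoc % k)) group))]
      (into {k id} (for [[kk vs] sets] [kk (vec vs)])))))

(def sample-maps
  [{:id 1, :region :NA, :name :Test1, :OS :W}
   {:id 1, :region :EU, :name :Test2, :OS :W}
   {:id 2, :region :AS, :name :test3, :OS :L}
   {:id 2, :region :AS, :name :test4, :OS :M}])

(def merged-maps (merge-by :id sample-maps))
```

Wrapping each value before merging means the combining function always sees collections, which avoids special cases for the first merge step.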

(def xs [{:id 1, :region :NA, :name :Test1, :OS :W}
         {:id 1, :region :EU, :name :Test2, :OS :W}
         {:id 2, :region :AS, :name :test3, :OS :L}
         {:id 2, :region :AS, :name :test4, :OS :M}])

(merge-keys :id xs)
=> ({:id [1],
     :region [:EU :NA],
     :name [:Test1 :Test2],
     :OS [:W]}
    {:id [2],
     :region [:AS],
     :name [:test3 :test4],
     :OS [:L :M]})

Note, however, that even the :id key now has a vector associated with it. You can un-vector it either by adding an if expression to collect, so that it returns a single value instead of a one-element vector...

(defn collect [& xs]
  ;; unpack vectors produced by an earlier merge step before combining
  (let [cs (vec (sort (distinct (mapcat #(if (vector? %) % [%]) xs))))]
    (if (= 1 (count cs)) (first cs) cs)))

...or take the result from merge-keys and do

(map #(update-in % [:id] first) result)

which un-vectors only the :id map entry.
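The two options differ in what they unwrap, which the sketch below tries to make concrete. Here merged-result is simply the merge-keys output shown earlier, written out as a literal, and unwrap-singles is a hypothetical helper that applies the "single value instead of a one-element vector" rule after the fact. Both names are assumptions made for this example.

```clojure
(def merged-result
  ;; The merge-keys output from above, written out literally.
  [{:id [1], :region [:EU :NA], :name [:Test1 :Test2], :OS [:W]}
   {:id [2], :region [:AS], :name [:test3 :test4], :OS [:L :M]}])

;; Unwrap only :id; every other key keeps its vector.
(def id-unwrapped (map #(update-in % [:id] first) merged-result))

(defn unwrap-singles
  "Hypothetical helper: replace every one-element vector with its element."
  [m]
  (into {} (for [[k v] m] [k (if (= 1 (count v)) (first v) v)])))

;; Unwrap every one-element vector, so :OS [:W] and :region [:AS] also
;; turn into bare keywords.
(def all-unwrapped (map unwrap-singles merged-result))
```

If downstream code expects every non-id value to be a vector, unwrapping only :id keeps the shape of the result uniform.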


 