
Clojure/dataset: group-by multiple columns hierarchically?

I would like to implement a function that groups by multiple columns hierarchically. The following tentative implementation for two columns illustrates what I need:

(defn group-by-two-columns-hierarchically
  [col1 col2 table]
  (let [data-by-col1 ($group-by col1 table)
        data-further-by-col2 (into {} (for [[k v] data-by-col1] [k ($group-by col2 v)]))]
    data-further-by-col2))

How can I generalize this to an arbitrary number of columns?

(I understand that Incanter supports grouping by multiple columns, but it returns a flat structure rather than a hierarchy: a map from a composite key of the column values to the matching datasets.)
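
To make the difference concrete, here is a small sketch using plain Clojure maps rather than Incanter datasets. It is an analogy only: the rows, the column names :a, :b, :v, and the vars flat and nested are made up for illustration, and Incanter's actual composite keys are shaped differently.

```clojure
;; Hypothetical sample rows, used only for illustration.
(def rows [{:a 1 :b 1 :v 10}
           {:a 1 :b 2 :v 20}
           {:a 2 :b 1 :v 30}])

;; Flat: a single map keyed by the composite [a b] value.
(def flat (group-by (juxt :a :b) rows))

;; Hierarchical: one level of nesting per column (written out by hand here).
(def nested
  {1 {1 [{:a 1 :b 1 :v 10}]
      2 [{:a 1 :b 2 :v 20}]}
   2 {1 [{:a 2 :b 1 :v 30}]}})

;; The same rows are reached either way, but through different kinds of key.
(= (get flat [1 2]) (get-in nested [1 2]))
;; => true
```

The flat form needs the full composite key up front, while the nested form lets me take everything under the first column value with a single lookup.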

Thanks for your help!

Note: to make Michał's solution work for an Incanter dataset, only a slight modification is needed: replace "group-by" with "incanter.core/$group-by", as the following experiment shows:

(defn group-by*
      "Similar to group-by, but takes a collection of functions and returns
      a hierarchically grouped result."
      [fs coll]
      (if-let [f (first fs)]
        (into {} (map (fn [[k vs]]
                        [k (group-by* (next fs) vs)])
                   (incanter.core/$group-by f coll)))
        coll))

(def table (incanter.core/dataset ["x1" "x2" "x3"]
                                  [[1 2 3]
                                   [1 2 30]
                                   [4 5 6]
                                   [4 5 60]
                                   [7 8 9]]))

(group-by* [:x1 :x2] table)
=>
    {{:x1 1} {{:x2 2} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  1 |  2 |  3 |
        |  1 |  2 | 30 |
        }, 
    {:x1 4} {{:x2 5} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  4 |  5 |  6 |
        |  4 |  5 | 60 |
        }, 
    {:x1 7} {{:x2 8} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  7 |  8 |  9 |
        }}

(defn group-by*
  "Similar to group-by, but takes a collection of functions and returns
  a hierarchically grouped result."
  [fs coll]
  (if-let [f (first fs)]
    (into {} (map (fn [[k vs]]
                    [k (group-by* (next fs) vs)])
               (group-by f coll)))
    coll))

Example:

user> (group-by* [:foo :bar :quux]
        [{:foo 1 :bar 1 :quux 1 :asdf 1}
         {:foo 1 :bar 1 :quux 2 :asdf 2}
         {:foo 1 :bar 2 :quux 1 :asdf 3}
         {:foo 1 :bar 2 :quux 2 :asdf 4}
         {:foo 2 :bar 1 :quux 1 :asdf 5}
         {:foo 2 :bar 1 :quux 2 :asdf 6}
         {:foo 2 :bar 2 :quux 1 :asdf 7}
         {:foo 2 :bar 2 :quux 2 :asdf 8}
         {:foo 1 :bar 1 :quux 1 :asdf 9}
         {:foo 1 :bar 1 :quux 2 :asdf 10}
         {:foo 1 :bar 2 :quux 1 :asdf 11}
         {:foo 1 :bar 2 :quux 2 :asdf 12}
         {:foo 2 :bar 1 :quux 1 :asdf 13}
         {:foo 2 :bar 1 :quux 2 :asdf 14}
         {:foo 2 :bar 2 :quux 1 :asdf 15}
         {:foo 2 :bar 2 :quux 2 :asdf 16}])
{1 {1 {1 [{:asdf 1, :bar 1, :foo 1, :quux 1}
          {:asdf 9, :bar 1, :foo 1, :quux 1}],
       2 [{:asdf 2, :bar 1, :foo 1, :quux 2}
          {:asdf 10, :bar 1, :foo 1, :quux 2}]},
    2 {1 [{:asdf 3, :bar 2, :foo 1, :quux 1}
          {:asdf 11, :bar 2, :foo 1, :quux 1}],
       2 [{:asdf 4, :bar 2, :foo 1, :quux 2}
          {:asdf 12, :bar 2, :foo 1, :quux 2}]}},
 2 {1 {1 [{:asdf 5, :bar 1, :foo 2, :quux 1}
          {:asdf 13, :bar 1, :foo 2, :quux 1}],
       2 [{:asdf 6, :bar 1, :foo 2, :quux 2}
          {:asdf 14, :bar 1, :foo 2, :quux 2}]},
    2 {1 [{:asdf 7, :bar 2, :foo 2, :quux 1}
          {:asdf 15, :bar 2, :foo 2, :quux 1}],
       2 [{:asdf 8, :bar 2, :foo 2, :quux 2}
          {:asdf 16, :bar 2, :foo 2, :quux 2}]}}}
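
As a possible alternative (a sketch, not part of the solution above), the same nested shape can be built in a single pass with reduce and update-in, and then navigated with get-in. The name group-by-path and the sample rows are hypothetical. The sketch assumes fs is non-empty, since update-in does not treat an empty key path as "return the collection unchanged" the way group-* does.

```clojure
(defn group-by-path
  "Hypothetical single-pass variant of group-by*. Assumes fs is non-empty."
  [fs coll]
  (reduce (fn [acc item]
            ;; The path is the value of each grouping function, in order;
            ;; fnil starts a fresh vector the first time a path is seen.
            (update-in acc (mapv #(% item) fs) (fnil conj []) item))
          {}
          coll))

(def grouped
  (group-by-path [:foo :bar]
                 [{:foo 1 :bar 1 :asdf 1}
                  {:foo 1 :bar 2 :asdf 2}
                  {:foo 2 :bar 1 :asdf 3}
                  {:foo 1 :bar 1 :asdf 4}]))

;; Drill down to one leaf by listing the key for each level.
(get-in grouped [1 1])
;; => [{:foo 1, :bar 1, :asdf 1} {:foo 1, :bar 1, :asdf 4}]
```

The recursive version is arguably easier to adapt to other grouping functions such as $group-by, whereas this variant only works on plain sequences of items.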


 