How can I get the nested keys of a map in Clojure?

If my structure is

{ :a :A
  :b :B
  :c {
       :d :D
     }
  :e {
       :f {
            :g :G
            :h :H
          }
     }
}

I would like to get a function called keys-in that returns something like:

[[:a] [:b] [:c :d] [:e :f :g] [:e :f :h]]

so that I can then do something like:

(not-any? nil? (map #(get-in my-other-map %1) (keys-in my-map)))

This way I can be sure that my-other-map has the same keys as my-map.
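One caveat with the check above: get-in returns nil both for a missing key and for a key whose stored value is nil, so the nil? test cannot tell the two apart. A minimal sketch of a stricter check follows; has-path? is a hypothetical helper name that does not come from the original post.

```clojure
;; Hypothetical helper (not from the original post): true when every key
;; along `path` is actually present in `m`, even if the stored value is nil.
(defn has-path? [m path]
  (if (empty? path)
    true
    (let [[k & ks] path]
      (and (associative? m)   ; guards contains? against non-collections
           (contains? m k)
           (recur (get m k) ks)))))

(get-in {:a nil} [:a])            ;=> nil  (looks like a missing key)
(has-path? {:a nil} [:a])         ;=> true
(has-path? {:a {:b 1}} [:a :c])   ;=> false
```

With such a helper, the comparison could be written as (every? #(has-path? my-other-map %) (keys-in my-map)).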

(defn keys-in [m]
  (if (map? m)
    (vec 
     (mapcat (fn [[k v]]
               (let [sub (keys-in v)
                     nested (map #(into [k] %) (filter (comp not empty?) sub))]
                 (if (seq nested)
                   nested
                   [[k]])))
             m))
    []))

;; tests
user=> (keys-in nil)
[]
user=> (keys-in {})
[]
user=> (keys-in {:a 1 :b 2})
[[:a] [:b]]
user=> (keys-in {:a {:b {:c 1}}})
[[:a :b :c]]
user=> (keys-in {:a {:b {:c 1}} :d {:e {:f 2}}})
[[:a :b :c] [:d :e :f]]

(defn keys-in [m]
  (if (or (not (map? m))
          (empty? m))
    '(())
    (for [[k v] m
          subkey (keys-in v)]
      (cons k subkey))))

Obligatory zippers version:

(require '[clojure.zip :as z])

(defn keys-in [m] 
  (letfn [(branch? [[path m]] (map? m)) 
          (children [[path m]] (for [[k v] m] [(conj path k) v]))] 
    (if (empty? m) 
      []
      (loop [t (z/zipper branch? children nil [[] m]), paths []] 
        (cond (z/end? t) paths 
              (z/branch? t) (recur (z/next t), paths) 
              :leaf (recur (z/next t), (conj paths (first (z/node t)))))))))

If you don't need a lazy result and just want to be fast, try using reduce-kv.

(defn keypaths
  ([m] (keypaths [] m ()))
  ([prev m result]
   (reduce-kv (fn [res k v] (if (map? v)
                              (keypaths (conj prev k) v res)
                              (conj res (conj prev k))))
              result
              m)))

If you also want to support vector indices (as with get-in or update-in), test with associative? instead of map?. If you want intermediate paths, you can conj those on too. Here's a variant:

(defn kvpaths-all2
  ([m] (kvpaths-all2 [] m ()))
  ([prev m result]
   (reduce-kv (fn [res k v] (if (associative? v)
                              (let [kp (conj prev k)]
                                (kvpaths-all2 kp v (conj res kp)))
                              (conj res (conj prev k))))
              result
              m)))
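For comparison, here is a sketch of the associative? test applied without the intermediate paths. The name leaf-paths is made up for this illustration, and two details are assumptions rather than part of the answer above: results accumulate into a vector so they come out in traversal order, and an empty collection is treated as a leaf.

```clojure
;; Hypothetical variant: leaf paths only, descending into maps and vectors.
(defn leaf-paths
  ([m] (leaf-paths [] m []))
  ([prev m result]
   (reduce-kv (fn [res k v]
                ;; descend only into non-empty associative values;
                ;; for vectors, reduce-kv supplies the index as k
                (if (and (associative? v) (seq v))
                  (leaf-paths (conj prev k) v res)
                  (conj res (conj prev k))))
              result
              m)))

(leaf-paths {:a [10 {:b 1}] :c 2})
;=> [[:a 0] [:a 1 :b] [:c]]
```

Each returned path can be passed straight to get-in, for example (get-in {:a [10 {:b 1}] :c 2} [:a 1 :b]) yields 1.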

You can build this with clojure.zip or tree-seq fairly easily, though I strongly prefer the Prismatic Schema library for verifying the structure of nested maps:

user> (require '[schema.core :as schema :refer [Keyword]])
nil
user> (def my-data-format
        {:a Keyword
         :b Keyword
         :c {:d Keyword}
         :e {:f {:g Keyword
                 :h Keyword}}})
#'user/my-data-format
user> (def some-data
        {:a :A
         :b :B
         :c {:d :D}
         :e {:f {:g :G
                 :h :G}}})
#'user/some-data
user> (schema/validate my-data-format some-data)
{:a :A, :c {:d :D}, :b :B, :e {:f {:g :G, :h :G}}}
user> (def some-wrong-data
        {:a :A
         :b :B
         :c {:wrong :D}
         :e {:f {:g :G
                 :h :G}}})
#'user/some-wrong-data
user> (schema/validate my-data-format some-wrong-data)

ExceptionInfo Value does not match schema:
{:c {:d missing-required-key,
     :wrong disallowed-key}}
schema.core/validate (core.clj:132)
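For the tree-seq route mentioned at the start of this answer, a minimal sketch could look like the following. The name keys-in-tree-seq and the choice to treat empty maps as leaves are assumptions made for this illustration. Each tree node is a [path value] pair, so the path travels along with the value during the walk.

```clojure
;; Sketch only: walk [path value] pairs with tree-seq and keep the leaf paths.
(defn keys-in-tree-seq [m]
  (letfn [(branch? [[_ v]] (and (map? v) (seq v)))   ; non-empty map => branch
          (children [[path v]]
            (for [[k child] v] [(conj path k) child]))]
    (->> (tree-seq branch? children [[] m])
         (remove branch?)     ; drop interior nodes
         (map first)          ; keep only the path part
         (remove empty?))))   ; the root itself is not a key path

(keys-in-tree-seq {:a 1 :b {:c 2}})
;=> ([:a] [:b :c])
```

The result is a lazy sequence; wrap it in vec if a vector of paths is needed.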

I had a similar question and wasn't satisfied with the current solutions:

"Naive" recursive approach:

(require '[clojure.set :as set])

(defn all-paths
  ([m current]
   ;; base case: map empty or not a map
   (if (or (not (map? m)) (empty? m))
     #{current}
   ;; else: recursive call for every (key, value) in the map
     (apply set/union #{current}
            (map (fn [[k v]]
                   (all-paths v (conj current k)))
                 m))))
  ([m]
   (-> m (all-paths []) (disj []))))


(all-paths {:a 1
            :b 2
            :c {:ca 3
                :cb {:cba 4
                     :cbb 5}}
            :d {:da 6
                :db 7}})
=> #{[:a] [:b] [:c] [:d] [:c :ca] [:c :cb] [:d :da] [:d :db] [:c :cb :cba] [:c :cb :cbb]}

Here are solutions (without intermediate paths) using Specter. They're by Nathan Marz, Specter's author, from a conversation on the Specter Slack channel (posted with his permission). I claim no credit for these definitions.

Simple version:

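(require '[com.rpl.specter :refer [recursive-path if-path collect-one select
                                   ALL FIRST LAST STAY]])

(defn keys-in [m]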
  (let [nav (recursive-path [] p
              (if-path map?
                [ALL (collect-one FIRST) LAST p]
                STAY))]
    (map butlast (select nav m))))

More efficient version:

(defn keys-in [m]
  (let [nav (recursive-path [] p
              (if-path map?
                [ALL
                 (if-path [LAST map?]
                  [(collect-one FIRST) LAST p]
                  FIRST)]))]
    (select nav m)))

My informal explanation of what's happening in these definitions:

In the simple version, since the top-level argument is a map, if-path map? passes it to the first collection of navigators in brackets. These begin with ALL, which here says to do the rest for each element in the map. Then, for each MapEntry in the map, (collect-one FIRST) says to add its first element (the key) to the result of passing its last element (the val) to if-path again. p was bound by recursive-path to be a reference to that same recursive-path expression. By this process we eventually get to a non-map. We return it and stop processing on that branch; that's what STAY means. However, this last thing returned is not one of the keys; it's the terminal val. So we end up with the leaf vals at the end of each sequence. To strip them out, map butlast over the entire result.

The second version avoids this last step by only recursing into the val in the MapEntry if that val is itself a map. That's what the inner if-path does: [LAST map?] gets the last element, i.e. the val of the current MapEntry generated by ALL, and passes it to map?. If the val is not a map, FIRST navigates to the key instead, so no leaf val ends up in the result.
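To make the difference between the two strategies concrete without Specter, here is a plain-Clojure analogue. It is only an illustration of the explanation above, not Specter code and not how Specter works internally; all three function names are made up for this sketch.

```clojure
;; Strategy of the simple version: walk down to every leaf val, carrying the
;; keys seen so far, and emit keys + leaf val ...
(defn paths-with-leaves [m]
  (letfn [(walk [path x]
            (if (map? x)
              (mapcat (fn [[k v]] (walk (conj path k) v)) x)
              [(conj path x)]))]
    (walk [] m)))

;; ... then strip the trailing leaf val in a second pass.
(defn keys-in-simple [m]
  (map butlast (paths-with-leaves m)))

;; Strategy of the second version: descend only when the val is a map,
;; so the leaf val never enters the result and no second pass is needed.
(defn keys-in-direct [m]
  (letfn [(walk [path x]
            (mapcat (fn [[k v]]
                      (if (map? v)
                        (walk (conj path k) v)
                        [(conj path k)]))
                    x))]
    (walk [] m)))

(paths-with-leaves {:a 1 :b {:c 2}})  ;=> ([:a 1] [:b :c 2])
(keys-in-simple {:a 1 :b {:c 2}})     ;=> ((:a) (:b :c))
(keys-in-direct {:a 1 :b {:c 2}})     ;=> ([:a] [:b :c])
```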


I used Criterium to test all of the key path functions on this page that don't return intermediate paths, plus one by noisesmith that's part of an answer to another question. For both a 3-level map with 3 keys per level and a 6-level map with 6 keys per level, miner49r's version and the second, faster Specter version have similar speeds, and are much faster than any of the other versions.

Timings on a 3-level, 3 keys per level (27 paths) map, fastest first:

  • miner49r's: 29.235649 µs
  • N. Marz's second Specter: 30.590085 µs
  • N. Marz's first Specter: 62.840230 µs
  • amalloy's: 75.740468 µs
  • noisesmith's (from the other question): 87.693425 µs
  • AWebb's: 162.281035 µs
  • AlexMiller's (without vec): 243.756275 µs

Timings on a 6-level, 6 keys per level (6^6 = 46656 paths) map, fastest first:

  • N. Marz's second Specter: 34.435956 ms
  • miner49r's: 37.897345 ms
  • N. Marz's first Specter: 119.600975 ms
  • noisesmith's: 180.448860 ms
  • amalloy's: 191.718783 ms
  • AWebb's: 193.172784 ms
  • AlexMiller's (without vec): 839.266448 ms

All calls were wrapped in doall so that lazy results would be realized. Since I was doall-ing them, I took out the vec wrapper in Alex Miller's definition. Full details about timings can be found here. The test code is here.

(The simple Specter version is slower than the faster version because of the use of map butlast to strip out the leaf values. If this step is removed, the simple Specter definition's times are similar to those of the second definition.)
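The benchmark maps themselves are not shown here, so the following is only a sketch of how maps of that shape could be generated. nested-map and leaf-count are hypothetical helpers, not the linked test code.

```clojure
;; Hypothetical helper: a map `depth` levels deep with `width` keys per level.
(defn nested-map [depth width]
  (if (zero? depth)
    :leaf
    (into {} (for [i (range width)]
               [(keyword (str "k" i)) (nested-map (dec depth) width)]))))

;; Counts leaf values, which equals the number of leaf key paths.
(defn leaf-count [m]
  (if (map? m)
    (reduce + (map leaf-count (vals m)))
    1))

(leaf-count (nested-map 3 3))  ;=> 27
(leaf-count (nested-map 6 6))  ;=> 46656
```

With Criterium on the classpath, a timing run might then look like (criterium.core/quick-bench (doall (keys-in (nested-map 3 3)))), with keys-in standing in for whichever definition is being measured.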

This answer of mine is just to illustrate how NOT to do it, since it is still procedural.

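;; genkeys is defined after keys-in, so it has to be declared first
(declare genkeys)

(defn keys-in [data] (genkeys [] data))

(defn genkeys [parent data]
  ;; mylist is mutated in place inside doseq, which is exactly the
  ;; procedural style this answer warns against
  (let [mylist (transient [])]
    (doseq [k (keys data)]
      ;; map? also matches small array maps, which a class check against
      ;; clojure.lang.PersistentHashMap would miss
      (if (map? (get data k))
        (reduce conj! mylist (genkeys (conj parent k) (get data k)))
        (conj! mylist (conj parent k))))
    (persistent! mylist)))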

Here is an implementation based on lazy-seq which returns all keys (not just the terminal keys):

(defn keys-in
  ([m] (if (map? m) (keys-in (seq m) [])))
  ([es c]
   (lazy-seq
    (when-let [e (first es)]
      (let [c* (conj c (key e))]
        (cons c* (concat (if (map? (val e)) (keys-in (seq (val e)) c*))
                         (keys-in (rest es) c))))))))

I'm working on something similar for a personal project, and this is my naive implementation:

(defn keys-in
  [m parent-keys]
  (mapcat (fn [[k v]]
            (if (map? v)
              (keys-in v (conj parent-keys k))
              ;; note: the leaf value v is appended to the end of the path
              (vector (conj parent-keys k v))))
          m))

Use it from the REPL:

(keys-in <your-map> [])

Fancy way (drops the trailing value from each path):

(map (comp vec drop-last) (keys-in <your-map> []))

Here is a generic solution for known collection types, including maps (look for "Key Paths" on the Readme page for usage examples).

It handles mixed types as well (sequential types, maps and sets), and the API (protocols) can be extended to other types.
