How to remove sequential matches in a vector in Clojure?

Suppose I have a vector ["a" "b" "c" "a" "a" "b"]. Given the sequence ["a" "b"], how can I remove every instance of that sequence (in order)? Here, the result would just be ["c" "a"].

If the sequence to remove is known in advance, core.match may be useful for your task:

(require '[clojure.core.match :refer [match]])

(defn remove-patterns [seq]
  (match seq
    ["a" "b" & xs] (remove-patterns xs)
    [x & xs] (cons x (remove-patterns xs))
    [] ()))


(remove-patterns ["a" "b" "c" "a" "a" "b"]) ;; => ("c" "a")

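The short answer is to treat the vector as a string and remove the match with a regex (this relies on every element being a single-character string):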

(require '[clojure.string])

(defn remove-ab [v]
  (mapv str (clojure.string/replace (apply str v) #"ab" "")))

(remove-ab ["a" "b" "c" "a" "a" "b"])
=> ["c" "a"]
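The string trick above joins the elements before matching, so element boundaries are lost. A minimal sketch of that limitation, assuming a hypothetical copy of the function named remove-ab-regex (the name and the extra inputs are illustrative, not from the original answer):

```clojure
(require '[clojure.string :as string])

;; Hypothetical copy of the regex approach, renamed to avoid clashing
;; with the other remove-ab definitions in this thread.
(defn remove-ab-regex [v]
  (mapv str (string/replace (apply str v) #"ab" "")))

;; Single-character elements: behaves as intended.
(remove-ab-regex ["a" "b" "c" "a" "a" "b"]) ;; => ["c" "a"]

;; The single element "ab" is not the sequence "a" "b", yet it is removed.
(remove-ab-regex ["ab" "c"]) ;; => ["c"]

;; Multi-character elements are split into one string per character.
(remove-ab-regex ["xy" "z"]) ;; => ["x" "y" "z"]
```

If the elements can be longer than one character, or are not strings at all, one of the sequence-based answers is probably the safer choice.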

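The long answer is to implement your own regex-style state machine: iterate over the sequence, recognize the matches, and return the sequence without them.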

Automat can help you build your own low-level regex state machine: https://github.com/ztellman/automat

Instaparse can be used to build rich grammars: https://github.com/Engelberg/instaparse

You don't really need a library for a match this small; you can implement it as a loop:

(defn remove-ab [v]
  (loop [[c & remaining] v
         acc []
         saw-a false]
    (cond
     (nil? c) (if saw-a (conj acc "a") acc) ;; terminate
     (and (= "b" c) saw-a) (recur remaining acc false)  ;; ignore ab
     (= "a" c) (recur remaining (if saw-a (conj acc "a") acc) true) ;; got a
     (and (not= "b" c) saw-a) (recur remaining (conj (conj acc "a") c) false) ;; keep ac
     :else (recur remaining (conj acc c) false)))) ;; add c

However, getting every condition right can be quite tricky, which is why a formal regex or state machine is advantageous.
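One way to keep the conditions manageable is to make the state explicit. The following is only a sketch under that assumption: it uses reduce over a small state map, and the names remove-ab-reduce, acc, and pending-a are hypothetical, not from the original answer. Because it does not test elements with nil?, a nil inside the input does not end the traversal early.

```clojure
;; State: acc is the output built so far; pending-a is true when an "a"
;; has been seen but not yet emitted (it may be the start of "a" "b").
(defn remove-ab-reduce [v]
  (let [{:keys [acc pending-a]}
        (reduce (fn [{:keys [acc pending-a]} x]
                  (cond
                    ;; "a" followed by "b": drop both
                    (and pending-a (= x "b"))
                    {:acc acc :pending-a false}

                    ;; a new "a": flush the previous pending "a", if any
                    (= x "a")
                    {:acc (if pending-a (conj acc "a") acc) :pending-a true}

                    ;; anything else: flush the pending "a", then keep x
                    :else
                    {:acc (conj (if pending-a (conj acc "a") acc) x)
                     :pending-a false}))
                {:acc [] :pending-a false}
                v)]
    ;; flush a trailing "a" that never met its "b"
    (if pending-a (conj acc "a") acc)))

(remove-ab-reduce ["a" "b" "c" "a" "a" "b"]) ;; => ["c" "a"]
(remove-ab-reduce ["a" "a" "b"])             ;; => ["a"]
```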

Or a recursive definition:

(defn remove-ab [[x y & rest]]
  (cond
   (and (= x "a") (= y "b")) (recur rest)
   (nil? x) ()
   (nil? y) [x]
   :else (cons x (remove-ab (cons y rest)))))

A recursive solution for 2-element subsequences:

(defn f [sq [a b]]
  (when (seq sq)
    (if (and (= (first sq) a)
             (= (second sq) b))
      (f (rest (rest sq)) [a b])
      (cons (first sq) (f (rest sq) [a b])))))

Not exhaustively tested, but it seems to work.

A simple solution using lazy-seq, take and drop, which works for any finite subsequence and for any sequence to be filtered (including infinite ones):

(defn remove-subseq-at-start
  [subseq xs]
  (loop [xs xs]
    (if (= (seq subseq) (take (count subseq) xs))
      (recur (drop (count subseq) xs))
      xs)))

(defn remove-subseq-all [subseq xs]
  (if-let [xs (seq (remove-subseq-at-start subseq xs))]
    (lazy-seq (cons (first xs) (remove-subseq-all subseq (rest xs))))
    ()))

(require '[clojure.test :refer [deftest is]])

(deftest remove-subseq-all-test
  (is (= ["c" "a"] (remove-subseq-all ["a" "b"] ["a" "b" "a" "b" "c" "a" "a" "b"])))
  (is (= ["a"] (remove-subseq-all ["a" "b"] ["a"])))
  (is (= ["a" "b"] (remove-subseq-all [] ["a" "b"])))
  (is (= [] (remove-subseq-all ["a" "b"] ["a" "b" "a" "b"])))
  (is (= [] (remove-subseq-all ["a" "b"] nil)))
  (is (= [] (remove-subseq-all [] [])))
  (is (= ["a" "b" "a" "b"] (->> (remove-subseq-all ["c" "d"] (cycle ["a" "b" "c" "d"]))
                                (drop 2000000)
                                (take 4))))

  (is (= (seq "ca") (remove-subseq-all "ab" "ababcaab"))))

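If you can guarantee that the input is a vector, you can use subvec to check, at each element, whether the subvector starting there with the same length as the pattern matches the pattern. If it does, skip it; otherwise move ahead to the next element of the vector: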

(let [pattern ["a" "b"]
      source ["a" "b" "c" "a" "a" "b"]]
    (loop [source source
           pattern-length (count pattern)
           result []]
        (if (< (count source) pattern-length)
            (into [] (concat result source))
            (if (= pattern (subvec source 0 pattern-length))
              ; skip matched part of source
              (recur (subvec source pattern-length) pattern-length result)
              ; otherwise move ahead one element and save it as result
              (recur (subvec source 1) pattern-length 
                     (conj result (first source)))))))

For general sequences, you can use the same approach, substituting take and drop where appropriate.
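A minimal sketch of that take/drop variant, assuming an eager loop and a hypothetical function name remove-pattern-seq (neither is from the original answer). An empty pattern is treated as "nothing to remove", which is an assumption made here to avoid looping forever:

```clojure
;; Hypothetical helper: remove every left-to-right, non-overlapping
;; occurrence of pattern from any finite seqable coll.
(defn remove-pattern-seq [pattern coll]
  (let [pattern (seq pattern)
        n       (count pattern)]
    (if (zero? n)
      ;; assumption: an empty pattern removes nothing
      (vec coll)
      (loop [remaining (seq coll)
             result    []]
        (cond
          ;; input exhausted
          (nil? remaining) result

          ;; the next n elements match: skip them
          (= pattern (take n remaining))
          (recur (seq (drop n remaining)) result)

          ;; otherwise keep one element and move ahead
          :else
          (recur (next remaining) (conj result (first remaining))))))))

(remove-pattern-seq ["a" "b"] ["a" "b" "c" "a" "a" "b"]) ;; => ["c" "a"]
(remove-pattern-seq [1 2] '(1 2 3 1 2))                  ;; => [3]
```

Unlike the lazy-seq answer, this version builds the whole result eagerly, so it is not suitable for infinite input.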

