Replace Duplicates in Clojure Vector

I am trying to replace duplicates in a vector with empty strings. However, the only functions I can find are to remove duplicates, not replace them. How can I take

["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]

and output:

["Oct 2016" "" "Nov 2016" "" "" ""]

Everything I can find returns ["Oct 2016" "Nov 2016"] instead. I'm currently getting the desired output with a nested doseq, but it seems inefficient. Is there a better way to achieve this? Thanks!

Here is a strategy for a solution.

  1. Loop over the items of the vector.
  2. Maintain a set of visited items. It can be used to check whether an item has been seen before.
  3. For each item: if the set already contains it, append "" to the result vector.
  4. Otherwise this is its first occurrence: append the item to the result vector and add it to the set.
  5. Return the result vector when all items are visited.
  6. Optionally: Use a transient result vector for better performance.

Code:

(defn duplicate->empty [xs]
  (loop [xs     (seq xs)
         result []
         found  #{}]
    (if-let [[x & xs] (seq xs)]
      (if (contains? found x)
        (recur xs (conj result "") found)
        (recur xs (conj result x) (conj found x)))
      result)))

Calling it:

(duplicate->empty ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])
=> ["Oct 2016" "" "Nov 2016" "" "" ""]
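Step 6 in the list above mentions a transient result vector. A minimal sketch of that variant follows; the name duplicate->empty-transient is hypothetical, and the logic is otherwise the same as duplicate->empty. Whether it is measurably faster depends on the input size, so treat it as an option to benchmark rather than a guaranteed win.

```clojure
;; Hypothetical name; same logic as duplicate->empty, but the result
;; vector is built as a transient and made persistent once at the end.
(defn duplicate->empty-transient [xs]
  (loop [xs     (seq xs)
         result (transient [])
         found  #{}]
    (if xs
      (let [x (first xs)]
        (if (contains? found x)
          ;; already seen: append a placeholder, leave the set alone
          (recur (next xs) (conj! result "") found)
          ;; first occurrence: append the item and remember it
          (recur (next xs) (conj! result x) (conj found x))))
      (persistent! result))))

(duplicate->empty-transient ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])
;; => ["Oct 2016" "" "Nov 2016" "" "" ""]
```

Note that the value returned by conj! is always passed on to the next iteration; a transient should not be mutated in place with the return value ignored.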

A transducer version, just for completeness:

(defn empty-duplicates
  ([]
   (fn [rf]
     (let [seen (volatile! #{})]
       (fn
         ([] (rf))
         ([res] (rf res))
         ([res x]
          (if (contains? @seen x)
            (rf res "")
            (do (vswap! seen conj x)
                (rf res x))))))))
  ([coll]
   (sequence (empty-duplicates) coll)))

(comment

  (def months ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])

  (into [] (empty-duplicates) months) ;=> ["Oct 2016" "" "Nov 2016" "" "" ""]

  )
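
Another option: start from a vector of empty strings and write each distinct item at the index of its first occurrence:

(defn eliminate-duplicates [v]
  (let [;; map from each distinct item to the index of its first occurrence
        index-of-first-occurrences (apply merge-with (fn [a _] a)
                                          (map-indexed (fn [i x] {x i}) v))]
    ;; thread the transient through reduce and use the value assoc! returns,
    ;; rather than relying on in-place mutation
    (persistent!
      (reduce (fn [result [s pos]] (assoc! result pos s))
              (transient (vec (repeat (count v) "")))
              index-of-first-occurrences))))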

Basically the same as the loop-based version, but using lazy sequence generation:

(defn rdups
  ([items] (rdups #{} items))
  ([found [x & xs :as items]]
   (when (seq items)
     (if (contains? found x)
       (lazy-seq (cons "" (rdups found xs)))
       (lazy-seq (cons x (rdups (conj found x) xs)))))))

user> (rdups ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])
;;=> ("Oct 2016" "" "Nov 2016" "" "" "")
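One thing the lazy approach allows is consuming only a prefix of a very long or infinite input. The sketch below is a hypothetical variant (rdups-lazy is not part of the original answer) that wraps the whole body in lazy-seq, so that not even the first element is examined until the result is consumed:

```clojure
;; Hypothetical variant of rdups: lazy-seq wraps the whole body, so
;; nothing is realized until the caller starts consuming the result.
(defn rdups-lazy
  ([items] (rdups-lazy #{} items))
  ([found items]
   (lazy-seq
     (when-let [[x & xs] (seq items)]
       (if (contains? found x)
         (cons "" (rdups-lazy found xs))
         (cons x (rdups-lazy (conj found x) xs)))))))

;; Works on an infinite input because only five elements are realized.
(take 5 (rdups-lazy (cycle ["a" "b"])))
;; => ("a" "b" "" "" "")
```

The set of seen items still grows with the number of distinct values consumed, so memory use is bounded by that, not by the input length.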

You could use iterate:

(def months ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])

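;; State is [remaining-items seen-set result-vector].
(defn step [[[head & tail] dups res]]
  [tail
   (conj dups head)
   ;; contains? instead of (dups head), so a false or nil item is handled too
   (conj res (if (contains? dups head)
               ""
               head))])

(defn empty-dups [xs]
  (->> (iterate step [xs #{} []])
       ;; stop when no items remain, rather than when the head is falsy
       (drop-while (fn [[remaining _ _]] (seq remaining)))
       (map #(nth % 2))
       first))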

(empty-dups months)
;; => ["Oct 2016" "" "Nov 2016" "" "" ""]
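The state that iterate threads through step (a seen set plus a result vector) has the same shape as a reduce accumulator. As a rough sketch, and assuming the whole result is wanted as a vector, the same idea can be written with reduce; empty-dups-reduce is a hypothetical name, not part of the original answer:

```clojure
;; Hypothetical reduce-based equivalent: the accumulator is
;; [result-vector seen-set], and the input itself drives termination.
(defn empty-dups-reduce [xs]
  (first
    (reduce (fn [[res seen] x]
              [(conj res (if (contains? seen x) "" x))
               (conj seen x)])
            [[] #{}]
            xs)))

(empty-dups-reduce ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"])
;; => ["Oct 2016" "" "Nov 2016" "" "" ""]
```

Because reduce stops when the input is exhausted, there is no need for a separate termination check like the drop-while step.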
