簡體   English   中英

從列表中刪除重復項並將其組合在Lisp中

[英]removing duplicates from a list and combining it in Lisp

我有以下類型的列表

(("abc" "12" "45")
 ("abc" "34" "56")
 ("cdb" "56" "78")
 ("deg" "90" "67")
 ("deg" "45" "34"))

和期望的輸出是

(("abc" "12" "45" "34" "56")
 ("cdb" "56" "78")
 ("deg" "90" "67" "45 "34)).

在Lisp中,同樣的方法是什么?

在Common Lisp中,一種可能性是這樣的:

(defun merge-lists (lists)
  (let ((rv (make-hash-table :test #'equal)))
         (mapcar (lambda (list)
           (mapcar (lambda (x) (push x (gethash (car list) rv nil))) (cdr list)))
                   lists)
    (loop for key being the hash-keys of rv
          collect (cons key (reverse (gethash key rv))))))

這個帖子已經有了很多很棒的答案。 但是,由於沒有人提到Common Lisp集合操作,我以為我會用自己的方式管道。

假設你的數據真的像這樣:

'((("abc") ("12" "45"))
  (("abc") ("34" "56"))
  (("cdb") ("56" "78"))
  (("deg") ("90" "67"))
  (("deg") ("45" "34")))

,即一個與一系列值配對的密鑰表。 而你想要的是合並給定鍵的值,而不僅僅是追加它們,然后Common Lisp有一系列直接的操作來做到這一點。 只需使用assocunion 注意,union的工作原理如下:

(setf record1 '("abc" "12" "34" "56"))
(setf record2 ' ("abc" "56" "45" "43"))
(union (cdr record1) (cdr record2) :test #'string=)

=> ("34" "12" "56" "45" "43")

assoc允許您從列表列表中構建鍵值表。 您可以添加幾個訪問函數來抽象出底層表示,如下所示:

(defun get-record (table key)
  (assoc key table :test #'string=))

(defun merge-records (record1 record2)
  (if (not record1) 
      record2
          (cons (car record1) 
        (union (cdr record1) (cdr record2) :test #'string=))))

(defun insert-record (table record)
  (cons (merge-records record (get-record table (car record))) table))

所以,使用你的測試數據:

(setf raw-data '(("abc" "12" "45")
    ("abc" "34" "56")
    ("abc" "45" "43")  ;; Note, duplicate value 45 to illustrate usage of union.
    ("cdb" "56" "78")
    ("deg" "90" "67")
    ("deg" "45" "34")))

將數據加載到表中:

(setf data-table (reduce  #'insert-record raw-data :initial-value '()))

打印表:

(mapcar (lambda (key) (get-record data-table key)) '("abc" "cdb" "deg"))

==> (("abc" "12" "34" "56" "45" "43") ("cdb" "78" "56") ("deg" "34" "45" "67" "90"))

當然,對於插入或查找值,alists效率不高。 但它們使用起來非常方便,因此典型的工作流程是使用alist解決方案開發您的解決方案,通過訪問功能抽象實際實現,然后,一旦您明確了解問題並確定實施,選擇更有效的數據結構 - 當然,如果它會對現實世界的表現產生影響。

在Racket中,這是Scheme的一種方言,它又是Lisp的一種方言,你可以通過使用哈希表來跟蹤具有相同第一個元素的列表之間的重復元素,使用第一個元素作為鍵來解決這個問題,通過折疊操作累積結果,最后映射鍵上的鍵/值對及其列表值。 這是如何做:

(define input
  '(("abc" "12" "45") ("abc" "34" "56") ("cdb" "56" "78")
    ("deg" "90" "67") ("deg" "45" "34")))

(hash-map
 (foldl (lambda (e h)
          (hash-update h (car e)
                       (lambda (p) (append (cdr e) p))
                       (const '())))
        (make-immutable-hash)
        input)
 cons)

結果是預期的,雖然合並列表中的元素以不同的順序出現(但這不應該是一個問題,如果需要,排序它們是微不足道的):

'(("deg" "45" "34" "90" "67") ("abc" "34" "56" "12" "45") ("cdb" "56" "78"))

在Common Lisp中,使用排序和尾遞歸的強力解決方案可以是:

(defun combine-duplicates (list)
  (labels ((rec (tail marker accum result)
             (cond ((not tail)
                    (append result (list accum)))
                   ((equal marker (caar tail))
                    (rec (cdr tail)  marker (append accum (cdar tail)) result))
                   (t
                    (rec (cdr tail) (caar tail) (car tail) (append result (list accum)))))))
    (if (not list) nil
        (let ((sorted-list (sort list #'string-lessp :key #'car)))
          (rec (cdr sorted-list) (caar sorted-list) (car sorted-list) nil)))))

如上所述,問題的輸入已經按第一個元素排序,這是一個利用這個事實的解決方案。 它只對輸入列表進行一次傳遞,以相反的順序構建結果列表,並返回( nreverse d)結果。

(defparameter *input* 
  '(("abc" "12" "45")
    ("abc" "34" "56")
    ("cdb" "56" "78")
    ("deg" "90" "67")
    ("deg" "45" "34")))

(defparameter *desired-output* 
  '(("abc" "12" "45" "34" "56")
    ("cdb" "56" "78")
    ("deg" "90" "67" "45" "34")))

(defun merge-duplicates (input) 
  ;; Start with the result being empty, and continue until there are
  ;; no more sublists in the input to process.  Since the result is
  ;; built up in reverse order, it is NREVERSEd for return.
  (do ((result '()))
      ((endp input) (nreverse result))
    ;; Each element from the input can be popped off, and should have
    ;; the form (key . elements).  
    (destructuring-bind (key &rest elements) (pop input)
      ;; The result list (except in the first iteration) has the form
      ;; ((key-x . elements-x) ...), so we check whether key is equal
      ;; to key-x.
      (if (equal key (first (first result)))
          ;; If it is, then replace elements-x with (append
          ;; elements-x elements).  (This keeps the merged lists in
          ;; order.  This is a bit wasteful; we could record all
          ;; these elements during traversal and only concatenate
          ;; once at the end, but it would complicate the return form
          ;; a bit.
          (setf (rest (first result))
                (append (rest (first result)) elements))
          ;; Otherwise, they're different, and we can just push (key
          ;; . elements) into the result list, since it marks the
          ;; beginning of a new sublist.  Since we destructively
          ;; update the tails, we do not want to put the cons from
          ;; the input into results, so we make a copy using (list*
          ;; key elements) (which is EQUAL to the thing we popped
          ;; from input.
          (push (list* key elements)
                result)))))

這是一個實際的例子,還有一個測試,以確保它返回一個正確的結果:

CL-USER> (problem *input*)
(("abc" "12" "45" "34" "56") ("cdb" "56" "78") ("deg" "90" "67" "45" "34"))

CL-USER> (equal (problem *input*) *desired-output*)
T

如果輸入具有形式((nil ...) ...) ,它將失敗,因為result最初nil並且(first (first result))將返回nil ,因此(equal key (first (first result)))將是真的, (setf (rest (rest ...)) ...)將嘗試訪問一個不可setf地方。 在創建合並尾部時也有點浪費,但從未指定這些元素的順序應該是什么,所以這至少會嘗試將它們保持在相同的順序。

Common Lisp,但既不快也不短。 您可以刪除copy-list並將原始文件移除,但是它可以生成給定共享結構的循環列表。 TEST關鍵字具有規范默認值。

(defun fixup-alist (old &key (test #'eql))
  "Combine OLD alist's duplicate keys."
  (let ((new (mapcar #'list
                     (delete-duplicates (mapcar #'car old)
                                        :test test))))
    (dolist (entry old new)
      (nconc (assoc (car entry) new
                    :test test)
             (copy-list (cdr entry))))))
FIXUP-ALIST
CL-USER> (fixup-alist x)
(("abc" "12" "45") ("abc" "34" "56") ("cdb" "56" "78") ("deg" "90" "67") ("deg" "45" "34"))
CL-USER> (fixup-alist x :test #'string=)
(("abc" "12" "45" "34" "56") ("cdb" "56" "78") ("deg" "90" "67" "45" "34"))

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM