I have a text with a lot of lines. How can I delete the repeated lines in Emacs, using Emacs commands or elisp packages, without external utilities?
For example:
this is line a
this is line b
this is line a
I want to remove the 3rd line (same as the 1st line), giving:
this is line a
this is line b
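If you have Emacs 24.4 or newer, the cleanest way to do it would be the new delete-duplicate-lines function. Note that it works on the region, so select the text first, and that it keeps the first occurrence of each line, maintaining the relative order of the originals.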
For example, if your input is
test
dup
dup
one
two
one
three
one
test
five
M-x delete-duplicate-lines
would make it
test
dup
one
two
three
five
You have the option of searching backwards by prefixing it with the universal argument (C-u). The result would then be
dup
two
three
one
test
five
Credit goes to emacsredux.com .
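Since delete-duplicate-lines works on a region, a small wrapper can apply it to the whole buffer. The following is only a sketch: the name my-delete-duplicate-lines-in-buffer is made up for illustration, and it assumes Emacs 24.4 or newer, where delete-duplicate-lines accepts BEG, END and an optional REVERSE argument.

```elisp
;; Hypothetical helper (not part of Emacs): run `delete-duplicate-lines'
;; over the whole accessible portion of the buffer.
(defun my-delete-duplicate-lines-in-buffer (&optional keep-last)
  "Delete duplicate lines in the whole accessible buffer.
With prefix argument KEEP-LAST, search backwards, so the last
occurrence of each line is kept instead of the first."
  (interactive "P")
  ;; The third argument of `delete-duplicate-lines' is REVERSE.
  (delete-duplicate-lines (point-min) (point-max) keep-last))
```

With this in your init file, M-x my-delete-duplicate-lines-in-buffer behaves like the first example above without selecting a region, and C-u M-x my-delete-duplicate-lines-in-buffer like the second.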
Other roundabout options, not giving quite the same result, available via Eshell:
sort -u ; doesn't maintain the relative order of the originals
uniq ; worse, it needs its input to be sorted
Put this code in your .emacs:
(defun uniq-lines (beg end)
  "Unique lines in region.
Called from a program, there are two arguments:
BEG and END (region to sort)."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (not (eobp))
        (kill-line 1)
        (yank)
        (let ((next-line (point)))
          (while
              (re-search-forward
               (format "^%s" (regexp-quote (car kill-ring))) nil t)
            (replace-match "" nil nil))
          (goto-char next-line))))))
Usage:
M-x uniq-lines
On Linux, select the region and type
M-| uniq <RETURN>
The result, without duplicates, is shown in a new buffer. Note that uniq only removes adjacent duplicate lines.
(require 'cl-lib) ; for cl-position-if and cl-incf

(defun unique-lines (start end)
  "This will remove all duplicating lines in the region.
Note empty lines count as duplicates of the empty line! All empty lines are
removed sans the first one, which may be confusing!"
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string (buffer-substring-no-properties start end) "$" t)
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (setq s                           ; because Emacs can't properly
                                        ; split lines :/
            (substring
             s (cl-position-if
                (lambda (x)
                  (not (or (char-equal ?\n x) (char-equal ?\r x)))) s)))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))
The unique-lines function above is an alternative. It unifies line endings to \n (UNIX-style), which may be a bonus or a disadvantage, depending on your situation. It could be faster still if you re-implement split-string so that it accepts characters instead of a regular expression.
A somewhat longer, but perhaps a bit more efficient, variant:
(defun split-string-chars (string chars &optional omit-nulls)
  (let ((separators (make-hash-table))
        (last 0)
        current
        result)
    (dolist (c chars) (setf (gethash c separators) t))
    (dotimes (i (length string)
                (progn
                  (when (< last i)
                    (push (substring string last i) result))
                  (reverse result)))
      (setq current (aref string i))
      (when (gethash current separators)
        (when (or (and (not omit-nulls) (= (1+ last) i))
                  (/= last i))
          (push (substring string last i) result))
        (setq last (1+ i))))))
(require 'cl-lib) ; for cl-incf

(defun unique-lines (start end)
  "This will remove all duplicating lines in the region.
Note empty lines count as duplicates of the empty line! All empty lines are
removed sans the first one, which may be confusing!"
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string-chars
                (buffer-substring-no-properties start end) '(?\n) t)
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))
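The same hash-table idea can also be applied directly to the buffer, without first splitting the region into a list of strings. The following is only a sketch, not the answer author's code: the name my-unique-lines-in-place is hypothetical, it keeps the first occurrence of each line, and unlike the version above it leaves line endings and the kill ring untouched.

```elisp
;; Hypothetical helper: walk the region line by line and delete every
;; line whose text has already been seen.
(defun my-unique-lines-in-place (start end)
  "Delete duplicate lines between START and END, keeping first occurrences."
  (interactive "r")
  (let ((seen (make-hash-table :test #'equal))
        ;; A marker keeps tracking the region end while lines are deleted.
        (end-marker (copy-marker end)))
    (save-excursion
      (goto-char start)
      (while (< (point) end-marker)
        (let ((line (buffer-substring-no-properties
                     (line-beginning-position) (line-end-position))))
          (if (gethash line seen)
              ;; Duplicate: delete the line together with its newline, if any.
              (delete-region (line-beginning-position)
                             (min (point-max) (1+ (line-end-position))))
            ;; First occurrence: remember it and move on to the next line.
            (puthash line t seen)
            (forward-line 1)))))
    (set-marker end-marker nil)))
```

Usage is the same as for the functions above: select a region and run M-x my-unique-lines-in-place.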
Another way: