
How to delete repeated lines in Emacs

I have a text with a lot of lines. How can I delete the repeated lines in Emacs, using built-in commands or Elisp packages rather than external utilities?

For example:

this is line a
this is line b
this is line a

The goal is to remove the 3rd line (the same as the 1st line), giving:

this is line a
this is line b

If you have Emacs 24.4 or newer, the cleanest way to do it is the built-in delete-duplicate-lines function. Note that:

  • this works on a region, not the whole buffer, so select the desired text first
  • it keeps the first occurrence of each line in its original position and deletes the later duplicates

For example, if your input is

test
dup
dup
one
two
one
three
one
test
five

M-x delete-duplicate-lines would make it

test
dup
one
two
three
five

You also have the option of searching backwards by prefixing the command with the universal argument (C-u), which keeps the last occurrence of each line instead of the first. The result would then be

dup
two
three
one
test
five
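
The same function can also be called from Lisp rather than through M-x. The following is a minimal sketch, assuming Emacs 24.4 or newer; the helper name my/dedupe-lines-in-string is hypothetical and only serves to show the BEG and END arguments in use.

```elisp
;; Hypothetical helper (not part of Emacs): remove duplicate lines
;; from a string by running delete-duplicate-lines in a scratch buffer.
(defun my/dedupe-lines-in-string (text)
  "Return TEXT with duplicate lines removed, keeping first occurrences."
  (with-temp-buffer
    (insert text)
    ;; BEG and END cover the whole temporary buffer.
    (delete-duplicate-lines (point-min) (point-max))
    (buffer-string)))

;; Example call with the input used above.
(my/dedupe-lines-in-string
 "test\ndup\ndup\none\ntwo\none\nthree\none\ntest\nfive\n")
```

Passing a non-nil third argument (REVERSE) corresponds to the C-u variant described above.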

Credit goes to emacsredux.com.

Other roundabout options, which don't give quite the same result, are available via Eshell:

  1. sort -u; doesn't maintain the relative order of the originals
  2. uniq; worse, it needs its input to be sorted
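
To see why uniq needs sorted input without leaving Emacs, here is a small sketch that uses the optional ADJACENT argument of delete-duplicate-lines, which only removes a line when it repeats the line directly before it. The helper name my/uniq-like is hypothetical.

```elisp
;; Hypothetical helper (not part of Emacs): mimic uniq by removing
;; only duplicates that are adjacent to each other.
(defun my/uniq-like (text)
  "Return TEXT with adjacent duplicate lines removed."
  (with-temp-buffer
    (insert text)
    ;; Arguments: BEG END REVERSE ADJACENT.
    (delete-duplicate-lines (point-min) (point-max) nil t)
    (buffer-string)))

;; The second "a" is removed, but the third one survives because
;; a "b" line separates it from the earlier ones.
(my/uniq-like "a\na\nb\na\n")
```

On unsorted text, non-adjacent repeats are left in place, which is the limitation mentioned for uniq above.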

Put this code in your .emacs:

(defun uniq-lines (beg end)
  "Unique lines in region.
Called from a program, there are two arguments:
BEG and END (region to sort)."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      (while (not (eobp))
        (kill-line 1)
        (yank)
        (let ((next-line (point)))
          (while
              (re-search-forward
               (format "^%s" (regexp-quote (car kill-ring))) nil t)
            (replace-match "" nil nil))
          (goto-char next-line))))))

Usage:

M-x uniq-lines

On Linux, select a region and type

M-| uniq <RETURN>

The result, without duplicates, appears in a new buffer. Note that uniq only removes duplicates that are adjacent to each other.

(require 'cl-lib)

(defun unique-lines (start end)
  "Remove all duplicate lines in the region between START and END.
Note that empty lines count as duplicates of the empty line!  All empty
lines are removed except the first one, which may be confusing!"
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string (buffer-substring-no-properties start end) "$" t)
               ;; Result form: runs once after the loop and rewrites the region.
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (setq s                           ; because Emacs can't properly
                                        ; split lines :/
            (substring
             s (cl-position-if
                (lambda (x)
                  (not (or (char-equal ?\n x) (char-equal ?\r x)))) s)))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))

An alternative:

  • Will not use the undo history to store matches.
  • Will in general be faster (but if you are after ultimate speed, build a prefix tree).
  • Has the effect of replacing all former newline characters, whatever they were, with \n (UNIX-style), which may be a bonus or a disadvantage depending on your situation.
  • You could make it a little better (faster) if you re-implement split-string so that it accepts characters instead of a regular expression.
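
The core idea behind this alternative, remembering each line in a hash table the first time it is seen, can be sketched on a plain list of strings. This is an illustration rather than the answer's own code, and the name my/unique-strings is hypothetical.

```elisp
;; Hypothetical helper (not part of Emacs): order-preserving
;; de-duplication of a list of strings using a hash table.
(defun my/unique-strings (strings)
  "Return STRINGS without duplicates, keeping the first occurrence of each."
  (let ((seen (make-hash-table :test #'equal))
        result)
    (dolist (s strings)
      ;; Only collect a string the first time it is encountered.
      (unless (gethash s seen)
        (puthash s t seen)
        (push s result)))
    ;; The list was built back to front, so restore the original order.
    (nreverse result)))

(my/unique-strings '("a" "b" "a" "c" "b"))
```

Each lookup is a hash-table access, so the work grows roughly linearly with the number of lines instead of re-scanning the buffer for every line.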

A somewhat longer, but perhaps slightly more efficient, variant:

(defun split-string-chars (string chars &optional omit-nulls)
  "Split STRING at each character in the list CHARS.
If OMIT-NULLS is non-nil, drop empty substrings from the result."
  (let ((separators (make-hash-table))
        (len (length string))
        (last 0)
        result)
    (dolist (c chars) (puthash c t separators))
    (dotimes (i len)
      (when (gethash (aref string i) separators)
        ;; Push the piece before this separator, unless it is empty
        ;; and OMIT-NULLS is set.
        (when (or (not omit-nulls) (/= last i))
          (push (substring string last i) result))
        (setq last (1+ i))))
    ;; Push whatever follows the final separator.
    (when (or (not omit-nulls) (< last len))
      (push (substring string last len) result))
    (nreverse result)))

(require 'cl-lib)

(defun unique-lines (start end)
  "Remove all duplicate lines in the region between START and END.
Note that empty lines are dropped entirely, because the region is
split with OMIT-NULLS set."
  (interactive "r")
  (let ((hash (make-hash-table :test #'equal)) (i -1))
    (dolist (s (split-string-chars
                (buffer-substring-no-properties start end) '(?\n) t)
               ;; Result form: runs once after the loop and rewrites the region.
               (let ((lines (make-vector (1+ i) nil)))
                 (maphash
                  (lambda (key value) (setf (aref lines value) key))
                  hash)
                 (kill-region start end)
                 (insert (mapconcat #'identity lines "\n"))))
      (unless (gethash s hash)
        (setf (gethash s hash) (cl-incf i))))))

Another way:

  1. Select a region of text.
  2. Type C-u (prefix), then M-| (shell-command-on-region), then sort -u (the command to run on the selection; with the prefix, its output replaces the selection).
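
The two steps above can be wrapped in a command. This is a rough sketch that assumes an external sort program is available on the system; the name my/sort-unique-region is hypothetical.

```elisp
;; Hypothetical command (not part of Emacs): replace the region with
;; the output of the external "sort -u" program.
(defun my/sort-unique-region (beg end)
  "Replace the text between BEG and END with the output of sort -u."
  (interactive "r")
  ;; OUTPUT-BUFFER is nil and REPLACE is t, so the command's output
  ;; is inserted in place of the region, as C-u M-| does interactively.
  (shell-command-on-region beg end "sort -u" nil t))
```

As with sort -u in a shell, the lines come back sorted, so the original order is not preserved.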
