Delete all lines that match a pattern with sed (Linux Mint 17)

Question

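I am new to shell scripting.

I am scraping a website, and the scraped text contains a lot of repetition, typically forum menus. I would normally do this in Python, but I thought sed would save me from reading and printing the input, looping, and so on. I want to delete thousands of repeated lines from the same file. I do not want to copy the result to another file, because I would end up with 100 new files. Below is the script that I run from the bash shell.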

#!/bin/sed -f
sed -i '/^how$/d' input_file.txt
sed -i '/^is test$/d' input_file.txt
sed -i '/^repeated text/d' input_file.txt

Here is the content of the input file:

how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line

Then I run the following command in the shell:

sed -f scriptFile input_file.txt

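I get the following error:

sed: scriptFile line 2: unterminated `s' command

How do I correct the script, and what is the correct syntax for the command that runs it?

Thank you very much for your help.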

Answer 1

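A file passed to sed -f must contain only sed commands, not shell command lines. In your file, line 2 begins with the letter s, so sed reads "sed -i ..." as an s (substitute) command with e as its delimiter and then finds no closing delimiter, hence the error. Once you know what each command does, putting them into a script is easy. In your case, the script should be: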

/^how$/d
/^is test$/d
/^repeated text/d

That is all.
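
Since the question asks for the file to be changed in place, here is a minimal sketch of how the script above could be applied with -i. It assumes GNU sed (the version shipped with Linux Mint), where -i takes no separate argument, and it works in a scratch directory created with mktemp so that no real file is touched:

```shell
#!/bin/bash
# Work in a scratch directory so nothing outside it is modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The script file holds sed commands only, one per line.
cat > scriptFile <<'EOF'
/^how$/d
/^is test$/d
/^repeated text/d
EOF

# Sample input copied from the question.
cat > input_file.txt <<'EOF'
how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line
EOF

# -i edits the file in place (GNU sed); -f reads the commands from scriptFile.
sed -i -f scriptFile input_file.txt

cat input_file.txt
```

With this sample input only the line "repeated text is common" is deleted. The patterns ^how$ and ^is test$ are anchored at both ends, so they match only lines that consist of exactly "how" or "is test". To delete every line that merely contains the text, drop the anchors, for example /is test/d.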

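Making the script a standalone executable is just as easy. On Linux everything after the interpreter path in a shebang line is passed as a single argument, so the line has to name sed directly rather than go through env:

#!/bin/sed -f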
/^how$/d
/^is test$/d
/^repeated text/d

Then:

chmod +x your_sed_script
./your_sed_script <old >new
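
The location of sed differs between systems, so the following sketch looks it up with command -v and writes the shebang line from the result. This is only an illustration under that assumption; the three-line input file is made up, and the script is created in a scratch directory:

```shell
#!/bin/bash
# Work in a scratch directory so nothing outside it is modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Find the sed binary on this system; /bin/sed is common but not universal.
sed_path=$(command -v sed)

# Write the executable script: a shebang naming sed directly, then the commands.
{
  printf '#!%s -f\n' "$sed_path"
  printf '%s\n' '/^how$/d' '/^is test$/d' '/^repeated text/d'
} > your_sed_script
chmod +x your_sed_script

# Hypothetical input: the first and third lines match a delete pattern.
printf '%s\n' 'how' 'how to do this task' 'repeated text is common' > old

# The kernel starts sed with -f and the script path; the shell handles the redirections.
./your_sed_script <old >new

cat new
```

Only "how to do this task" should remain in new, because the other two lines match /^how$/ and /^repeated text/.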

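This is a very good and compact tutorial. You can learn a lot from it.

Below is an example from that site, in case the link goes dead:

If you have a large number of sed commands, you can put them in a file and use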

sed -f sedscript <old >new

where sedscript could look like this:

# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
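
As a side note, the five substitutions in that sample can also be written as a single y (transliterate) command, which maps each character in the first list to the character at the same position in the second. A small sketch, with an input string made up for illustration:

```shell
# y/aeiou/AEIOU/ replaces every lower case vowel with its upper case form.
result=$(printf '%s\n' 'sed is a stream editor' | sed 'y/aeiou/AEIOU/')
echo "$result"   # prints: sEd Is A strEAm EdItOr
```

For one-to-one character mappings y is shorter than a list of s commands; for anything involving regular expressions, s is still needed.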

Answer 2

Wouldn't it be easier to do this with egrep and mv, for example?

egrep -v 'pattern1|pattern2|pattern3|...' <input_file.txt >tmpfile.txt
mv tmpfile.txt input_file.txt

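Each pattern describes the lines to delete, just as in sed. You will not end up with extra files, because mv replaces the original file with the temporary one.

If there are too many patterns to specify directly on the command line, you can store them in a file and pass it to egrep with the -f option.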
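
A minimal sketch of the -f variant follows. It uses grep -E, which is the current spelling of egrep, and the file name patterns.txt and the five-line input are assumptions made for illustration:

```shell
#!/bin/bash
# Work in a scratch directory so nothing outside it is modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One extended regular expression per line; a line matching any of them is dropped.
cat > patterns.txt <<'EOF'
^how$
^is test$
^repeated text
EOF

# Hypothetical input: lines 1, 3 and 5 each match one pattern.
printf '%s\n' 'how' 'how to do this task' 'is test' 'this is text' 'repeated text is common' > input_file.txt

# -v inverts the match, -f reads the patterns from a file.
# mv runs only if grep succeeds, so a failed grep does not overwrite the original.
grep -E -v -f patterns.txt input_file.txt > tmpfile.txt && mv tmpfile.txt input_file.txt

cat input_file.txt
```

Note that grep exits with status 1 when it prints no lines at all, so if every line of the input matched a pattern, the mv would be skipped and tmpfile.txt would be left behind empty.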


Answer 1: score 3, accepted, 2015-08-12 01:43:16

Answer 2: score 0, 2015-08-12 06:01:36
