Delete all lines that match a pattern with sed (Linux Mint 17)

Question

I am new to shell scripting.

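I am scraping a website, and the scraped text contains a lot of repetition; typically these are forum menus. I would normally do this in Python, but I thought sed would save me from reading and printing the input, looping, and so on. I want to delete thousands of repeated lines from the same file. I don't want to copy the result to another file, because I would end up with 100 new files. Below is the sed script that I run from the bash shell.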

#!/bin/sed -f
sed -i '/^how$/d' input_file.txt
sed -i '/^is test$/d' input_file.txt
sed -i '/^repeated text/d' input_file.txt

Here is the content of the input file:

how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line

Then I run the following command in the shell:

sed -f scriptFile input_file.txt

I get the following error:

sed: scriptFile line 2: unterminated `s' command

How do I correct the script, and what is the correct syntax of the command to make it work?

Thank you very much for your help.

Answer 1

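Assuming you know what your commands do, putting them into a script is very easy. A sed script file contains only the sed commands themselves, without the sed -i prefix, the shell quotes, or the file name. In your case, the script should be: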

/^how$/d
/^is test$/d
/^repeated text/d

That's all.
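Since the question asks for the file to be edited in place, here is a minimal sketch of how the script file could be applied with -i. It assumes GNU sed (the -i option is not POSIX), and the scratch directory and variable names are only for the demonstration.

```shell
#!/bin/sh
# Sketch: apply a sed script file to a text file in place.
# Assumes GNU sed (for -i); the scratch directory is only for this demo.
workdir=$(mktemp -d)
input="$workdir/input_file.txt"
script="$workdir/scriptFile"

# Sample input from the question.
cat > "$input" <<'EOF'
how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line
EOF

# The script file holds bare sed commands, one per line:
# no "sed -i" prefix, no shell quoting, no file name.
cat > "$script" <<'EOF'
/^how$/d
/^is test$/d
/^repeated text/d
EOF

# -f reads the commands from the script file; -i rewrites the input file.
sed -i -f "$script" "$input"

cat "$input"
```

With this sample only the line "repeated text is common" is removed, because ^how$ and ^is test$ match only lines that consist of exactly that text. Dropping the leading ^ (for example /is test$/d) would also delete the lines that merely end in "is test".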

It is also easy to make the script an executable in its own right:

#!/bin/sed -f
/^how$/d
/^is test$/d
/^repeated text/d

Then:

chmod +x your_sed_script
./your_sed_script <old >new
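On Linux, everything after the interpreter path on a shebang line is passed as one single argument, so the shebang has to name sed directly and carry only -f. The sketch below builds such a script and runs it as a filter. The old and new file names, the sample lines, and the use of command -v to find sed are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch: make a sed script directly executable and run it as a filter.
# File names are hypothetical; the shebang is built from wherever sed lives.
workdir=$(mktemp -d)
sed_script="$workdir/your_sed_script"
old="$workdir/old"
new="$workdir/new"

# Shebang: absolute path to sed followed by the single argument -f.
printf '#!%s -f\n' "$(command -v sed)" > "$sed_script"
cat >> "$sed_script" <<'EOF'
/^how$/d
/^is test$/d
/^repeated text/d
EOF
chmod +x "$sed_script"

# Three sample lines: the first and the last match a delete pattern.
printf '%s\n' 'how' 'how to do this task' 'repeated text is common' > "$old"

# The kernel effectively runs "sed -f your_sed_script"; input arrives on stdin.
"$sed_script" < "$old" > "$new"

cat "$new"
```

sed treats the shebang line as an ordinary comment, so the same file still works with sed -f your_sed_script.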

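This is a very good, compact tutorial. You can learn a lot from it.

Here is an example from that site, in case the link dies:

If you have a large number of sed commands, you can put them in a file and use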

sed -f sedscript <old >new

sedscript could look like this:

# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
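To see what the vowel script does, it can be run on a single word. This is only an illustrative sketch; the sample word and the scratch file location are assumptions.

```shell
#!/bin/sh
# Sketch: run the vowel script against one sample word.
workdir=$(mktemp -d)
sedscript="$workdir/sedscript"

cat > "$sedscript" <<'EOF'
# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
EOF

# Each s command is applied in turn to every input line.
result=$(printf 'education\n' | sed -f "$sedscript")
echo "$result"   # prints EdUcAtIOn
```

For a pure character mapping like this one, the single sed command y/aeiou/AEIOU/ should give the same result.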

Answer 2

Wouldn't it be easier to do this with egrep and mv, for example:

egrep -v 'pattern1|pattern2|pattern3|...' <input_file.txt >tmpfile.txt
mv tmpfile.txt input_file.txt

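Each pattern describes the lines to delete, just as in sed. You won't end up with extra files, because mv replaces the original with the temporary file.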

If there are too many patterns to specify directly on the command line, you can store them in a file and use egrep's -f option.
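A minimal sketch of the pattern-file variant, using the patterns and sample input from the question. It uses grep -E, the equivalent spelling of egrep. The name patterns.txt and the scratch directory are assumptions.

```shell
#!/bin/sh
# Sketch: delete matching lines with grep -v and a pattern file, then
# move the result over the original so no extra file is left behind.
workdir=$(mktemp -d)
input="$workdir/input_file.txt"
patterns="$workdir/patterns.txt"   # hypothetical name
tmpfile="$workdir/tmpfile.txt"

cat > "$input" <<'EOF'
how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line
EOF

# One extended regular expression per line.
cat > "$patterns" <<'EOF'
^how$
^is test$
^repeated text
EOF

# -v keeps the lines that match none of the patterns; -f reads them from a file.
# mv runs only if grep printed at least one line and reported no error.
grep -E -v -f "$patterns" "$input" > "$tmpfile" && mv "$tmpfile" "$input"

cat "$input"
```

The patterns are written without the /…/d wrapper that sed needs, since grep takes plain regular expressions.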


Answer 1: score 3, accepted, 2015-08-12 01:43:16
Answer 2: score 0, 2015-08-12 06:01:36
