Remove all hyperlinks in a text file, linux scripting

Question

I am very new to scripting, but I want to learn it. I need to remove all occurrences of something like http://* from a text file, and I want to do it with the sed command and regular expressions.

Here is what I have come up with so far:

sed 's/http:\/\/.*/ /' < input.txt > output.txt

This replaces each hyperlink with a space, but it also removes the rest of the line after the link.

How can I fix this problem? I have tried adding a space, "http://.* ", or an end-of-word anchor, "http://.*\\>", and other tricks that I found on the internet, but they didn't work.

Is there a better way to do this than using sed?

Answer 1

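Sed is a fine way to do this. Try changing your regex to s!http://[^[:space:]]*! !g . In your version, .* is greedy and matches everything up to the end of the line, which is why the rest of the line disappears. [^[:space:]]* matches only non-whitespace characters, so the match stops at the end of the URL, and the g flag replaces every link on a line rather than just the first. Using ! as the delimiter means the slashes in http:// do not need escaping.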
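
As a quick check, here is a minimal sketch of that command run on a small sample file. The file contents and the example.com-style URLs are made up for illustration; the file names input.txt and output.txt are the ones from the question.

```shell
# Create a small sample file (hypothetical contents, for illustration only).
printf '%s\n' \
  'See http://example.com/page for details.' \
  'Two links: http://a.example http://b.example end' > input.txt

# [^[:space:]]* matches up to the next whitespace character only,
# so the text after each link is kept.
# The g flag replaces every link on a line, not just the first one.
# ! is used as the s command delimiter so the slashes need no escaping.
sed 's!http://[^[:space:]]*! !g' < input.txt > output.txt

cat output.txt
```

Each link is replaced by a single space and the whitespace around it is left alone, so the output has a few consecutive spaces where a link used to be. Note that punctuation directly attached to a URL, such as a trailing period or closing parenthesis, counts as non-whitespace and is removed along with the link.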
