Remove all hyperlinks in a text file, linux scripting

I am very new to scripting, but I want to learn it. What I have to do is remove all occurrences of something like http://* from a text file. I want to do it with the sed command and regular expressions.

Here is what I have come up with so far:

sed 's/http:\/\/.*/ /' < input.txt > output.txt

This code replaces each hyperlink with a space. The problem is that it also removes the rest of the line.

How can I fix this problem? I have tried adding a space, "http://.* ", or an end-of-word anchor, "http://.*\>", and other tricks that I found on the internet, but they didn't work.

And is there a better way to do this than using sed?

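Sed is a fine way to do this. Try changing your regex to s!http://[^[:space:]]*! !g . The .* in your pattern is greedy, so it matches everything up to the end of the line; [^[:space:]]* stops at the first whitespace character instead, and the trailing g replaces every match on a line rather than only the first.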
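
As a minimal sketch of that substitution in context, the commands below build a small sample file and run the suggested sed expression on it. The sample lines and their URLs are made up for illustration; the file names input.txt and output.txt are the ones from the question.

```shell
# Hypothetical sample input, two lines with three links in total.
printf '%s\n' \
  'see http://example.com/a?b=1 for details' \
  'two links: http://a.example http://b.example end' > input.txt

# "!" is used as the delimiter so the slashes in http:// need no escaping.
# [^[:space:]]* matches non-whitespace characters only, so the match ends
# at the first space or tab after the link and the rest of the line is kept.
sed 's!http://[^[:space:]]*! !g' < input.txt > output.txt

cat output.txt
```

Each link is replaced by a single space, so the spaces that surrounded it in the input remain next to that replacement. If that extra whitespace matters, it could be squeezed in a later step.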
