
Delete any special character using Sed

I have yet another list of subdomains. I want to remove any wildcard subdomain that includes one of these special characters:

()!&$#*+?

Mostly, the special characters appear as a random prefix, but they can also occur in the middle. Here is some sample data:

(www.imgur.com
***************diet.blogspot.com
*-1.gbc.criteo.com
------------------------------------------------------------i.imgur.com

This has been quite an inconvenience while scanning through the list. As always, I tried sed to fix it:

sed -i "/[!()#$&?+]/d" foo.txt ###Didn't work
sed -i "/[\!\(\)\#\$\&\?\+]/d" ###Escaping char didn't work

Running the commands above leaves the list unchanged; the file stays in its original state. My next idea was to chain a series of sed commands to remove the characters one by one:

cat foo.txt | sed -e "/!/d" -e "/#/d" -e "/\*/d" -e "/\$/d" -e "/(/d" -e "/)/d" -e "/+/d" -e "/\'/d" -e "/&/d" >> foo2.txt
cat foo.txt | sed -e "/\!/d" | sed -e "/\#/d" | sed -e "/\*/d" | sed -e "/\$/d" | sed -e "/\+/d" | sed -e "/\'/d" | sed -e "/\&/d" >> foo2.txt

If escaping every special character doesn't work, my logic must be wrong. I also tried the /g flag, with no more luck.

As a side note: I don't want - to be deleted, since some valid subdomains contain a - character:

line-apps.com
line-apps-beta.com
line-apps-rc.com
line-apps-dev.com

Any help would be appreciated.

Using sed

$ sed '/[[:punct:]]/d' input_file

This should delete all lines with special characters; however, it would help if you provided sample data. Note that [[:punct:]] also matches . and -, so on a list of domain names it would delete every line.
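A quick way to check how [[:punct:]] behaves on domain-like input is sketched below. The sample lines come from the question; the variable names and the explicit character set are only illustrative assumptions, not part of the original answer.

```shell
# Sample lines: two from the question plus one valid hyphenated domain.
sample='(www.imgur.com
*-1.gbc.criteo.com
line-apps.com'

# [[:punct:]] matches "." and "-" as well, so every line here is deleted.
punct_result=$(printf '%s\n' "$sample" | sed '/[[:punct:]]/d')

# An explicit set (the characters listed in the question) leaves "." and "-" alone.
explicit_result=$(printf '%s\n' "$sample" | sed '/[()!&$#*+?]/d')

printf 'punct:    [%s]\n' "$punct_result"
printf 'explicit: [%s]\n' "$explicit_result"
```

With the explicit set, only line-apps.com survives, which is the behaviour the question asks for.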

I ended up using single quotes '', as mentioned by @potong:

sed '/[\!\?\+\,\#\$\&\*\(\)\[\]\ ]/d'

I have no idea why the quoting makes a difference, but the shell is always the one to blame.

To do what you're trying to do in your answer (which adds [ and ] and more to the set of characters in your question) would be:

sed '/[][!?+,#$&*() ]/d'

or just:

grep -v '[][!?+,#$&*() ]'

Per POSIX, to include ] in a bracket expression it must be the first character (after an optional ^); otherwise it indicates the end of the bracket expression.
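The effect of where ] sits can be seen with a small experiment. This is only a sketch: the sample hostnames and variable names are made up for illustration.

```shell
# Hypothetical sample input containing "]", "[", a space, and one clean domain.
sample='a]b.example.com
a[b.example.com
a b.example.com
line-apps.com'

# "]" placed first is literal, so the set is: ] [ ! ? + , # $ & * ( ) and space.
first_result=$(printf '%s\n' "$sample" | grep -v '[][!?+,#$&*() ]')

# With "]" placed later, the bracket expression ends at the first "]":
# '[!?]]' means "one of ! or ?" followed by a literal "]".
late_result=$(printf '%s\n' 'x?]y' 'x]y' | grep '[!?]]')

printf 'first: %s\n' "$first_result"
printf 'late:  %s\n' "$late_result"
```

Here first_result holds only line-apps.com, while late_result holds only x?]y, because x]y has no ! or ? directly before the ].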

Consider printing the lines you want instead of deleting the lines you do not want, though, e.g.:

grep -E '^[[:alnum:]_.-]+$' file

to print lines that contain only letters, numbers, underscores, dashes, and/or periods.
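Applied to the sample data from the question, the allow-list approach might look like the sketch below. It uses grep -E with + so the character class can repeat across the whole line. The second, stricter pattern (requiring a leading letter or digit) is an assumption added here for illustration, not something the answer proposes.

```shell
# Sample lines taken from the question.
sample='(www.imgur.com
***************diet.blogspot.com
*-1.gbc.criteo.com
------------------------------------------------------------i.imgur.com
line-apps.com
line-apps-beta.com'

# Keep lines made up solely of letters, digits, "_", "." and "-".
# The dashed i.imgur.com line is kept too, because "-" is an allowed character.
kept=$(printf '%s\n' "$sample" | grep -E '^[[:alnum:]_.-]+$')

# Assumed refinement: also require the first character to be a letter or digit.
strict=$(printf '%s\n' "$sample" | grep -E '^[[:alnum:]][[:alnum:]_.-]*$')

printf 'kept:\n%s\n' "$kept"
printf 'strict:\n%s\n' "$strict"
```

Whether the stricter form is appropriate depends on what counts as a valid entry in the list; it drops the line that starts with a run of dashes while keeping the hyphenated line-apps domains.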
