Remove special characters in Linux files
I have a lot of *.java and *.xml files, but someone wrote some comments and strings with Spanish characters. I have been searching the web for how to remove them.
I tried
find . -type f -exec sed 's/[áíéóúñ]//g' DefaultAuthoritiesPopulator.java
just as an example. How can I remove these characters from many other files in subfolders?
Why are you trying to remove only characters with diacritic signs? It is probably worth removing all characters with codes outside the range 0-127, so the removal regexp would be s/[\x80-\xFF]//g (GNU sed syntax, run with LC_ALL=C so the range is matched byte by byte), if you're sure that your files should not contain high ASCII.
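As a quick check of this approach, the sketch below strips every byte outside 0-127 from a sample string. It assumes GNU sed (the \xHH escape is a GNU extension) and forces the C locale; the sample text and variable names are made up for illustration.

```shell
# Sample input containing UTF-8 encoded Spanish characters (illustrative only).
sample=$(printf 'ni\303\261o caf\303\251')

# LC_ALL=C makes sed treat the input as raw bytes, so the range 0x80-0xFF
# covers every byte of a multi-byte UTF-8 sequence.
stripped=$(printf '%s\n' "$sample" | LC_ALL=C sed 's/[\x80-\xFF]//g')

printf '%s\n' "$stripped"
```

Note that this removes the accented letters outright instead of replacing them with their unaccented forms, which matches what the question asks for.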
If that's what you really want, you can use find, almost as you are using it.
find -type f \( -iname '*.java' -or -iname '*.xml' \) -execdir sed -i 's/[áíéóúñ]//g' '{}' ';'
The differences:
. is implicit if no path is supplied (with GNU find).
-execdir is more secure than -exec (read the man page).
-i tells sed to modify the file argument in place. Read the man page to see how to use it to make a backup.
{} represents a path argument which find will substitute in.
; is part of the find syntax for -exec / -execdir.
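The backup option mentioned for -i can be sketched as follows. This is a minimal example under a few assumptions: GNU find and GNU sed are in use, and the temporary directory, the Demo.java file, and the .bak suffix are all hypothetical names chosen for illustration.

```shell
# Work in a throwaway directory so the example cannot touch real sources.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
printf 'String s = "a\303\261o";\n' > "$workdir/src/sub/Demo.java"

# -i.bak edits each file in place and keeps the original as <name>.bak
# (GNU sed syntax: the suffix is attached directly to -i).
find "$workdir/src" -type f \( -iname '*.java' -or -iname '*.xml' \) \
    -execdir sed -i.bak 's/[áíéóúñ]//g' '{}' ';'

cat "$workdir/src/sub/Demo.java"       # edited file
cat "$workdir/src/sub/Demo.java.bak"   # untouched backup
```

Once the result looks right, the backups can be deleted, for example with another find call that matches '*.bak'.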
You're almost there :)
find . -type f -exec sed -i 's/[áíéóúñ]//g' {} \;
                         ^^                 ^^
From sed(1):
-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if extension supplied)
From find(1):
-exec command ;
Execute command; true if 0 status is returned. All
following arguments to find are taken to be arguments to
the command until an argument consisting of `;' is
encountered. The string `{}' is replaced by the current
file name being processed everywhere it occurs in the
arguments to the command, not just in arguments where it
is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `\') or
quoted to protect them from expansion by the shell. See
the EXAMPLES section for examples of the use of the -exec
option. The specified command is run once for each
matched file. The command is executed in the starting
directory. There are unavoidable security problems
surrounding use of the -exec action; you should use the
-execdir option instead.
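Because -exec is true only when the command returns status 0, it can also serve as a filter to preview which files would be affected before anything is changed. The sketch below is one possible way to do that; the temporary directory and file names are hypothetical.

```shell
# Two sample files: one with a Spanish character, one plain ASCII.
workdir=$(mktemp -d)
printf 'a\303\261o\n' > "$workdir/With.java"
printf 'plain ascii\n' > "$workdir/Without.java"

# grep -q exits 0 only when the file contains a match, so -exec evaluates to
# true for exactly those files and -print lists them. Nothing is modified.
matches=$(find "$workdir" -type f -exec grep -q '[áíéóúñ]' {} \; -print)

printf '%s\n' "$matches"
```

Only the path of With.java should be listed, since grep finds no match in the other file.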
tr is the tool for the job:
NAME
tr - translate or delete characters
SYNOPSIS
tr [OPTION]... SET1 [SET2]
DESCRIPTION
Translate, squeeze, and/or delete characters from standard input, writing to standard output.
-c, -C, --complement
use the complement of SET1
-d, --delete
delete characters in SET1, do not translate
-s, --squeeze-repeats
replace each input sequence of a repeated character that is listed in SET1 with a
single occurrence of that character
Piping your input through tr -d áíéóúñ will probably do what you want.
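Since tr only reads standard input and writes standard output, applying it to many files needs a loop and a temporary file. The sketch below is one way to do this, under some assumptions: file names contain no newlines, the directory and file names are hypothetical, and the goal is to delete every non-ASCII byte. GNU tr works on bytes, not multi-byte characters, so the byte range is used here in place of a list of accented letters.

```shell
# Sample file in a throwaway directory (illustrative only).
workdir=$(mktemp -d)
printf 'a\303\261o\n' > "$workdir/Demo.java"

# tr cannot edit in place: write to a temporary file, then replace the original.
# '\200-\377' is the octal form of bytes 0x80-0xFF, i.e. everything non-ASCII.
find "$workdir" -type f \( -name '*.java' -o -name '*.xml' \) |
while IFS= read -r file; do
    LC_ALL=C tr -d '\200-\377' < "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done

cat "$workdir/Demo.java"
```

The mv step replaces each file with a newly created one, so permissions set on the original are not carried over.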