Remove non-ASCII special characters from a CSV file

When I edit a CSV file on Linux, special characters show up like £stackoverflow, £unixbox, £query. My question is how to remove the  from the CSV file.

Input: £stackoverflow, £unixbox,£query
Output: £stackoverflow, £unixbox,£query

Observations on the Linux box: the terminal window's translation setting is currently ISO-8859-1. When I change it (Window Setting > Translation > UTF-8) and open the same file in the vi editor, the  character disappears. I have also tried the iconv command, but it did not work. The reason may be that I am converting the file from ISO-8859-1 to UTF-8 while the default setting on the Linux box is ISO-8859-1, so the  is still displayed instead of being removed. How can I remove it?

You can try the Perl solution below. It removes every byte whose value is outside the range 32 to 127 (0x20 to 0x7f), which is where the ASCII text characters live. Note that this strips the £ sign as well as the Â, because neither is ASCII. The -l switch strips the newline before the substitution and adds it back on output; without it the newline (0x0a) would be deleted too.

$ echo "£stackoverflow, £unixbox,£query Output: £stackoverflow, £unixbox,£query" | perl -lpe ' s/[^\x20-\x7f]//g '
stackoverflow, unixbox,query Output: stackoverflow, unixbox,query
$

EDIT:

To remove just Â, use

$ echo "Â" | perl -pe ' s/./sprintf("%x |",ord($&))/eg '  # find the underlying byte values of Â
c3 |82 |

$ echo "£stackoverflow, £unixbox,£query" | perl -pe ' s/\xc3\x82//g '  # remove those two bytes using s///
£stackoverflow, £unixbox,£query

$
