How to substitute cp1250 specific characters to utf-8 in Vim

I have some Central European characters in cp1250 encoding in Vim. When I change the encoding with set encoding=utf-8, they appear as <d0> and the like. How can I replace those characters across the entire file with what they should be, i.e. Đ in this case?

As sidyll said, you should really use iconv for this. Iconv knows stuff. It knows all the hairy encodings, obscure code points, katakana, denormalized and canonical forms, compositions, non-spacing characters and the rest.

:%!iconv --from-code cp1250 --to-code utf-8

or, shorter:

:%!iconv -f cp1250 -t utf-8

to filter the whole buffer. If you do

:he xxd

you'll get a sample of how to automatically encode on buffer load/save, if you want that.

iconv -l lists all the encodings it accepts/knows about (many: 1168 on my system).
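To see what the conversion does at the byte level, here is a minimal Python sketch (not part of the original answer). It assumes a made-up sample word, "Đakovo", stored as cp1250 bytes; the helper names cp1250_to_utf8 and is_valid_utf8 are hypothetical. It shows why Vim displays <d0>: the byte 0xD0 means Đ in cp1250, but on its own it is not a valid UTF-8 sequence.

```python
# Hypothetical sample: the word "Đakovo" as it would be stored in a cp1250 file.
raw = b"\xd0akovo"


def cp1250_to_utf8(data: bytes) -> bytes:
    """Interpret the bytes as cp1250 text, then re-encode that text as UTF-8."""
    return data.decode("cp1250").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


converted = cp1250_to_utf8(raw)

print(is_valid_utf8(raw))         # the raw cp1250 bytes are not valid UTF-8
print(is_valid_utf8(converted))   # the converted bytes are
print(converted.decode("utf-8"))  # the text as it should read
```

This is the same reinterpretation that :%!iconv -f cp1250 -t utf-8 applies to the buffer: the single byte 0xD0 becomes the two-byte UTF-8 sequence 0xC4 0x90.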

Happy hacking!

The iconv() function may be useful:

iconv({expr}, {from}, {to})             *iconv()*
        The result is a String, which is the text {expr} converted
        from encoding {from} to encoding {to}.
        When the conversion fails an empty string is returned.
        The encoding names are whatever the iconv() library function
        can accept, see ":!man 3 iconv".
        Most conversions require Vim to be compiled with the |+iconv|
        feature.  Otherwise only UTF-8 to latin1 conversion and back
        can be done.
        This can be used to display messages with special characters,
        no matter what 'encoding' is set to.  Write the message in
        UTF-8 and use:
            echo iconv(utf8_str, "utf-8", &enc)
        Note that Vim uses UTF-8 for all Unicode encodings, conversion
        from/to UCS-2 is automatically changed to use UTF-8.  You
        cannot use UCS-2 in a string anyway, because of the NUL bytes.
        {only available when compiled with the +multi_byte feature}
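The contract described in the help text above (convert {expr} from one encoding to another, and return an empty result when the conversion fails) can be illustrated with a rough Python analogue. This is only a sketch of the documented behaviour, not Vim's implementation; iconv_like is a hypothetical helper, and it works on bytes rather than Vim strings.

```python
def iconv_like(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: convert bytes from encoding src to encoding dst.

    Mirrors only the documented contract: an empty result on failure.
    """
    try:
        return data.decode(src).encode(dst)
    except (UnicodeError, LookupError):
        # UnicodeError: the bytes or characters cannot be converted.
        # LookupError: the encoding name is unknown.
        return b""


# Succeeds: 0xD0 in cp1250 is "Đ", which UTF-8 can represent.
print(iconv_like(b"\xd0", "cp1250", "utf-8"))

# Fails: latin-1 has no "Đ", so the result is empty.
print(iconv_like("Đ".encode("utf-8"), "utf-8", "latin-1"))

# Fails: unknown encoding name, so the result is empty.
print(iconv_like(b"abc", "no-such-encoding", "utf-8"))
```

An empty result therefore signals a failed conversion, so check for it before replacing any text with the converted value.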

You can set encoding to the value of your file's encoding and termencoding to UTF-8. See the Vim mbyte documentation.
