What does '^@' mean in vim?

When I cat a file in bash I get the following:

$ cat /tmp/file 
microsoft

When I view the same file in vim I get the following:

^@m^@i^@c^@r^@o^@s^@o^@f^@t^@

How can I identify and remove these "non-printable" characters? What does '^@' mean in vim?

(Just a piece of background information: the file was created by base64-decoding and cutting from the pssh header of an mpd file for Microsoft PlayReady.)

What you see is Vim's visual representation of unprintable characters. It is explained at :help 'isprint':

    Non-printable characters are displayed with two characters:
          0 -  31   "^@" - "^_"
         32 - 126   always single characters
        127         "^?"
        128 - 159   "~@" - "~_"
        160 - 254   "| " - "|~"
        255         "~?"

Therefore, ^@ stands for a null byte = 0x00. These (and other non-printable characters) can come from various sources, but in your case it's an ...

encoding issue

If you look closely at your output in Vim, every second byte is a null byte; in between are the expected characters. This is a clear indication that the file uses a multibyte encoding (utf-16, big endian, no byte order mark, to be precise), and that Vim did not detect this and instead opened the file as latin1 or similar (whereas the terminal showed the expected text).
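The "every second byte is null" observation can be turned into a rough check outside Vim. The following is only a sketch: the function name guess_utf16_byte_order is made up for illustration, the heuristic only works for text in the ASCII range, and the sample bytes assume the file ends with a line feed (which would account for the trailing ^@ in the Vim display).

```python
def guess_utf16_byte_order(data: bytes):
    """Guess the byte order of BOM-less UTF-16 text that is mostly ASCII.

    Returns 'utf-16-be', 'utf-16-le', or None when the pattern does not fit.
    Heuristic only: it relies on one byte of every 16-bit unit being NUL.
    """
    if len(data) < 2 or len(data) % 2:
        return None  # UTF-16 data always has an even number of bytes
    even, odd = data[0::2], data[1::2]
    if not any(even) and all(odd):
        return "utf-16-be"  # 00 6D 00 69 ...: NUL comes first
    if not any(odd) and all(even):
        return "utf-16-le"  # 6D 00 69 00 ...: NUL comes second
    return None


# Assumed file content: "microsoft" followed by a line feed.
raw = "microsoft\n".encode("utf-16-be")
encoding = guess_utf16_byte_order(raw)
text = raw.decode(encoding)
print(encoding, repr(text))
```

Decoding with the guessed encoding (rather than deleting the null bytes) keeps the approach valid for characters outside ASCII, where the "other" byte is not zero and this guess would simply return None.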

To fix this, you can either explicitly specify the encoding:

:edit ++enc=utf-16 /tmp/file

Or tweak the 'fileencodings' option so that Vim can detect this automatically. However, be aware that ambiguities (as in your case) make this prone to failure:

    For an empty file or a file with only ASCII characters most encodings will work and the first entry of 'fileencodings' will be used (except "ucs-bom", which requires the BOM to be present).

That's why a byte order mark (BOM) is recommended for 16-bit encodings; but that assumes that you have control over the output encoding.
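If you do control how the file is written, prepending a BOM is straightforward. Here is a minimal sketch using Python's standard codecs module; the helper names encode_utf16_be_with_bom and sniff_bom are hypothetical, and the sample text is assumed from the question.

```python
import codecs


def encode_utf16_be_with_bom(text: str) -> bytes:
    """Encode as big-endian UTF-16 and prepend the BOM bytes FE FF."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def sniff_bom(data: bytes):
    """Return the UTF-16 byte order announced by a leading BOM, or None.

    Note: a UTF-32 little-endian BOM also starts with FF FE, so this check
    is only meaningful when UTF-32 input is not expected.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    return None


marked = encode_utf16_be_with_bom("microsoft\n")
print(marked[:2].hex(), sniff_bom(marked))
# Python's generic "utf-16" codec reads the BOM and strips it while decoding.
print(repr(marked.decode("utf-16")))
```

With the BOM present there is nothing left to guess, which is what the "ucs-bom" entry in 'fileencodings' relies on.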

^@ is Vim's representation of a null byte. The ^ indicates a non-printable control character, and the ASCII character that follows indicates which control character it is.

^@ == 0 (NUL)
^A == 1
^B == 2
...
^H == 8
^K == 11
...
^Z == 26
^[ == 27
^\ == 28
^] == 29
^^ == 30
^_ == 31
^? == 127

9 and 10 aren't escaped because they are Tab and Line Feed respectively.

32 to 126 are printable ASCII characters (starting with Space).
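The notation is regular enough to reproduce in a few lines. This is a sketch of the mapping only, following the ranges quoted from :help 'isprint'; the function name vim_display is made up, and real Vim behaviour also depends on 'isprint', 'encoding', and its separate handling of Tab and Line Feed.

```python
def vim_display(byte: int) -> str:
    """Render one byte value the way the 'isprint' table describes."""
    if not 0 <= byte <= 255:
        raise ValueError("expected a byte value in 0..255")
    if byte < 32:
        return "^" + chr(byte + 64)   # 0 -> ^@, 1 -> ^A, ..., 31 -> ^_
    if byte < 127:
        return chr(byte)              # printable ASCII, shown as-is
    if byte == 127:
        return "^?"
    if byte < 160:
        return "~" + chr(byte - 64)   # 128 -> ~@, ..., 159 -> ~_
    if byte < 255:
        return "|" + chr(byte - 128)  # 160 -> "| ", ..., 254 -> |~
    return "~?"


# Assumed file content: big-endian UTF-16 "microsoft" plus a line feed.
# The final 0x0A ends the line in Vim, so it is left out of the rendering.
raw = "microsoft\n".encode("utf-16-be")
rendered = "".join(vim_display(b) for b in raw[:-1])
print(rendered)
```

Under those assumptions the rendering reproduces the line from the question, including the trailing ^@ that belongs to the two-byte line feed.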
