regex unicode character in vim
I'm being an idiot.
Someone cut and pasted some text from Microsoft Word into my lovely HTML files.
I now have these unicode characters instead of regular quote symbols (i.e. quotes appear as <92> in the text).
I want to do a regex replace but I'm having trouble selecting them.
:%s/\u92/'/g
:%s/\u5C/'/g
:%s/\x92/'/g
:%s/\x5C/'/g
...all fail. My google-fu has failed me.
According to :help regexp (lightly edited), you need some specific syntax to select unicode characters with a regular expression in Vim:
\%u match specified multibyte character (eg \%u20ac)
That is, to search for the unicode character with hex code 20AC, enter this into your search pattern:
\%u20ac
The full table of character search patterns includes some additional options:
\%d match specified decimal character (eg \%d123)
\%x match specified hex character (eg \%x2a)
\%o match specified octal character (eg \%o040)
\%u match specified multibyte character (eg \%u20ac)
\%U match specified large multibyte character (eg \%U12345678)
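Applied to the case in the question, these patterns might be used as follows. This is only a sketch: it assumes the stray character really has hex value 92, as the <92> display suggests. If the text was converted to Unicode properly, Word's curly apostrophe would be U+2019 instead, so it is worth checking with ga first. The original attempts most likely failed because, in a Vim pattern, \u and \x without the % are character classes (uppercase letter and hex digit), not code-point escapes.

```vim
" Put the cursor on the odd character and press ga in Normal mode
" to see its decimal, hex and octal value.

" Replace every character with hex value 92 by a plain apostrophe.
:%s/\%x92/'/g

" Assumption: if ga reports hex 2019 instead, target that code point.
:%s/\%u2019/'/g
```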
This solution might not address the problem as originally stated, but it does address a different but very closely related one and I think it makes a lot of sense to place it here.
I don't know in which version of Vim it was implemented, but I was working on 7.4 when I tried it.
When in Insert mode, the sequence to output unicode characters is: ctrl-v u xxxx, where xxxx is the code point. For instance, outputting the euro sign would be ctrl-v u 20ac.
I tried it in Command-line mode as well and it worked. That is, to replace all instances of "20 euro" in my document with "20 €", I'd do:
:%s/20 euro/20 <ctrl-v u 20ac>/gc
In the above, <ctrl-v u 20ac> is not literal; it's the sequence of keys that will output the € character.
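If typing the key sequence is inconvenient, for example in a script or a mapping, one possible alternative is a sub-replace expression. This is a sketch rather than part of the original answer, and it assumes 'encoding' is set to utf-8 so that nr2char() returns the euro sign for 0x20ac.

```vim
" \= at the start of the replacement makes it a Vim script expression;
" nr2char(0x20ac) yields the character with code point U+20AC.
:%s/20 euro/\='20 ' . nr2char(0x20ac)/gc
```

The c flag still asks for confirmation on each match, as in the command above.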