Unexplained behaviour of vim or gcc

I ran this simple program:

#include <iostream>
#include <string>
using namespace std;

#include <boost/regex.hpp>

int main ()
{
    //    boost::regex fullname_regex ("[A–Z]+[a–z]*, [A-Z][a–z]*");
    boost::regex fullname_regex ("[A-Z]+[a-z]*, [A-Z][a-z]*");

    string name;
    cout << "Enter your full name: " << flush;

    getline (cin, name);
    if (! regex_match (name, fullname_regex))
    {
        cout << "Error: name not entered correctly" << endl;
    }

    return 0;
}

which I just copied from somewhere. When I uncomment the commented line (part of the original copy/paste) and comment out the next one (typed by myself), the program always rejects the name. Otherwise it works as expected. I am using vim. I did :set list to see hidden characters, and the lines look identical. I inserted a long comment before the original line in order to move it down, suspecting a disk fault (very old system), but I still got the same error. This is an Ubuntu server with no GUI; I use PuTTY to access it. I am not accustomed to such problems under Linux. If anybody has any idea what could explain this strange behaviour, please let me know. Maybe vim still uses some options from the original page (which is indeed formatted), but :set list does not show them?

The dashes are not the same. The commented ones are longer, are represented by different characters, and are thus interpreted differently. This is a common copy-and-paste error.

http://en.wikipedia.org/wiki/Dash

The dash character in the commented-out line is U+2013 EN DASH, not the ASCII hyphen-minus U+002D.
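To see the effect in isolation, here is a minimal sketch. It assumes std::regex as a stand-in for boost::regex (only to avoid an external dependency) and a UTF-8 encoded source file; the names kAsciiPattern, kPastedPattern and matches_full_name are made up for illustration. With byte-oriented matching, the three bytes of the en dash become ordinary members of the character class, so no range is formed.

```cpp
#include <regex>
#include <string>

// Pattern typed by hand: every range uses the ASCII hyphen-minus (U+002D).
const std::string kAsciiPattern = "[A-Z]+[a-z]*, [A-Z][a-z]*";

// Pattern as pasted: three of the four classes contain U+2013 EN DASH,
// spelled out here as its UTF-8 bytes E2 80 93. Each such class is just a
// set of five bytes (for example 'A', 0xE2, 0x80, 0x93, 'Z'), not a range.
const std::string kPastedPattern =
    "[A\xE2\x80\x93Z]+[a\xE2\x80\x93z]*, [A-Z][a\xE2\x80\x93z]*";

// Returns true if the whole of `name` matches `pattern`.
bool matches_full_name(const std::string& name, const std::string& pattern)
{
    return std::regex_match(name, std::regex(pattern));
}
```

Called with a name such as "Smith, John", the hand-typed pattern should accept it, while the pasted pattern rejects it because 'S' is not one of the five bytes in the first class.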

Because of the limited bitmap font I'm using, the Unicode character already stuck out when I opened the file, but you can use the g8 command to print the UTF-8 encoding of the character under the cursor, or use :call search('[^\x00-\x7F]') to locate the next non-ASCII character.
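The same check can be done outside the editor. The following is a rough sketch of a helper (the name non_ascii_offsets is hypothetical) that reports where non-ASCII bytes sit in a line, assuming the text is UTF-8 so that every byte above 0x7F belongs to a non-ASCII character.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the byte offsets of every non-ASCII byte (value > 0x7F) in `line`.
// In UTF-8, an en dash shows up as three consecutive offsets (E2 80 93).
std::vector<std::size_t> non_ascii_offsets(const std::string& line)
{
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < line.size(); ++i) {
        // Cast first: plain char may be signed, which would make the
        // comparison against 0x7F misbehave for high bytes.
        if (static_cast<unsigned char>(line[i]) > 0x7F) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

An empty result means the line is pure ASCII; anything else points at the bytes worth inspecting.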

When I pasted your code into my text editor, I saw immediately that the first [A–Z] in the commented line is actually using a long dash.

You want a simple dash, which is what you typed.

You seem to be confused about the purpose of :set list. It is not designed to show "weird" characters in general, only a very small set (tabs, non-breaking spaces, trailing spaces…); see :help 'list' for the details.

:set list wouldn't have helped in this case.

Regular spaces turned into non-breaking spaces are taken care of by :set list, but there are other special characters you should worry about when copy-pasting from the web, a PDF, mail clients, or word processors: " is often replaced by “ or ”, ' by ‘ or ’, and so on. The other day I had a long paragraph where all the ‘ and ’ were replaced with ¹. It was easy to spot in that case but could easily be missed in others.
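One way to guard against this when pasting code is to normalize the usual suspects back to ASCII. The sketch below is only illustrative: the function name normalize_pasted_punctuation and the choice of characters in the table are assumptions, the input is assumed to be UTF-8, and the list is far from exhaustive.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replaces a few common typographic characters (given as UTF-8 byte
// sequences) with their plain ASCII counterparts.
std::string normalize_pasted_punctuation(std::string text)
{
    static const std::vector<std::pair<std::string, std::string>> replacements = {
        {"\xE2\x80\x93", "-"},   // U+2013 EN DASH
        {"\xE2\x80\x94", "-"},   // U+2014 EM DASH
        {"\xE2\x80\x98", "'"},   // U+2018 LEFT SINGLE QUOTATION MARK
        {"\xE2\x80\x99", "'"},   // U+2019 RIGHT SINGLE QUOTATION MARK
        {"\xE2\x80\x9C", "\""},  // U+201C LEFT DOUBLE QUOTATION MARK
        {"\xE2\x80\x9D", "\""},  // U+201D RIGHT DOUBLE QUOTATION MARK
        {"\xC2\xA0", " "},       // U+00A0 NO-BREAK SPACE
    };
    for (const auto& r : replacements) {
        std::size_t pos = 0;
        while ((pos = text.find(r.first, pos)) != std::string::npos) {
            text.replace(pos, r.first.size(), r.second);
            pos += r.second.size();  // continue after the inserted ASCII
        }
    }
    return text;
}
```

Running pasted code through such a filter before compiling would turn the en-dash character classes back into real ranges; for prose, where the typographic characters may be intentional, it is safer to only report them.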
