First occurrence except escaped chars in C

How can I locate the first unescaped char in a string? In the following code, strchr finds the first quote at position 14, but I'm looking for the one at position 26.

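#include <stdio.h>
#include <string.h>

int main(void)
{
  /* Actual contents: FOO + pHWAx \"bar AER/2.1" BAZ */
  char str[] = "FOO + pHWAx \\\"bar AER/2.1\" BAZ";
  printf("%s\n", str);
  char *pch = strchr(str, '"');   /* stops at the escaped quote */
  if (pch)
    printf("found at %td\n", pch - str + 1);   /* %td matches ptrdiff_t */
  return 0;
}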

Use the strpbrk function to look for the first occurrence of any one of several characters at once. You must not skip the escape character; you must check whether it is followed by the character that you're really looking for.

I.e. suppose we want to look for " which can be escaped as \" . Actually, this means we must look for either " or \ . In other words:

char *ptr = strpbrk(string, "\"\\"); /* look for chars in the set { ", \ } */

But we have to do this in a loop, because we are not interested in escaped quotes and have to keep going:

char *quote = 0;
char *string = str; /* initially points to the str array */

while (*string != 0) {
  char *ptr = strpbrk(string, "\"\\");

Next we check whether we found something:

  if (!ptr)
    break;

If we found something, it is necessarily a \ or a " :

  if (*ptr == '"') {
    quote = ptr;
    break;
  }

If it is not a quote, then it must be an escape. We increment to the next character. If it is the terminating null, it means we have a backslash at the end of the string: an improper escape.

  if (*++ptr == 0)
    break;

Otherwise, we can skip the next character and continue the loop to scan for the next escape or unescaped quote.

  string = ++ptr;
}

If an unescaped quote occurs, then quote points to it after the execution of the while loop. Otherwise quote remains null.

This code assumes that there exist other escapes besides \" , but that they are all one character long, e.g. \b or \r . It will not work if there are longer escapes like \xff . Escapes constitute the conventions of a language: you have to know which language you are processing in order to handle them correctly.
