Removing a target character from a string using sscanf

I've recently been learning about the different conversion specifiers, but I am struggling with one of the more complex ones: the bracket specifier (%[set]).

To my understanding, from what I've read, %[set] consumes and assigns any string made up of the characters in set (the scanset), and %[^set] has the opposite effect: in essence, it consumes and assigns any string that does not contain the characters in the scanset.

That's my understanding, albeit roughly explained. I was trying to use this specifier with sscanf to remove a specified character from a string:

 sscanf(str_1, "%[^#]", str_2);

Suppose that str_1 contains "OH#989". My intention is to store this string in str_2, removing the hash character in the process. However, sscanf stops reading at the hash character and stores only "OH", when I intend to store "OH989".

Am I using the correct method in the wrong way, or am I using the wrong method altogether? How can I correctly remove/extract a specified character from a string using sscanf? I know this can be achieved with other functions and operators, but ideally I am hoping to use sscanf.

The scanset matches a sequence of one or more characters that either are listed in the scanset brackets or, with a leading ^, are not listed there. Matching stops at the first character the scanset does not accept. To get the two parts of your string, you'd need to use something like:

sscanf(str_1, "%[^#]#%[^#]", str_2, str_3);

We can negotiate on the second conversion specification; it might be that %s is sufficient, or that some other scanset is appropriate. Either way, this gives you the 'before #' and 'after #' strings, which can then be concatenated to give the desired result string.
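The split-and-concatenate idea can be sketched as a small function. This is only an illustration: the name remove_first_hash, the 15-character field widths, and the 32-byte destination are assumptions, not part of the original answer. A leading '#' is handled separately because %[^#] cannot match an empty prefix.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst without its first '#'.
   Each part is limited to 15 characters, so dst needs at least 31 bytes. */
void remove_first_hash(const char *src, char dst[32])
{
    char before[16] = "";   /* text in front of the '#' */
    char after[16] = "";    /* text behind the '#' */

    if (src[0] == '#') {
        /* %[^#] fails on an empty match, so skip the leading '#' by hand. */
        sscanf(src + 1, "%15[^#]", after);
    } else {
        sscanf(src, "%15[^#]#%15[^#]", before, after);
    }

    /* Parts that were not converted stay empty strings. */
    strcpy(dst, before);
    strcat(dst, after);
}
```

Because the second conversion is also %[^#], anything from a second '#' onward is dropped; with %s instead, it would stop at the first whitespace.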

I guess, if you really want to use sscanf for the purpose of removing a single target character, you could do this:

char str_2[strlen(str_1) + 1];
if (sscanf(str_1, "%[^#]", str_2) == 1) {
    size_t len = strlen(str_2);
    /* must verify if a '#' was found at all */
    if (str_1[len] != '\0') {
        strcpy(str_2 + len, str_1 + len + 1);
    }
} else {
    /* empty string, or '#' is the first character */
    strcpy(str_2, str_1 + (str_1[0] == '#'));
}

As you can see, sscanf is not the right tool for the job, because it has many quirks and shortcomings. A simple loop is more efficient and less error prone. You could also parse str_1 into 2 separate strings with sscanf(str_1, "%[^#]#%[\001-\377]", str_2, str_3); (a range such as \001-\377 inside a scanset is implementation-defined, though widely supported) and deal with the possible return values:

char str_2[strlen(str_1) + 1];
char str_3[strlen(str_1) + 1];
switch (sscanf(str_1, "%[^#]#%[\001-\377]", str_2, str_3)) {
  case EOF:  /* empty string: input failure before any conversion */
  case 0:    /* initial '#' */
    strcpy(str_2, str_1 + (str_1[0] == '#'));
    break;
  case 1:  /* no '#' or no trailing part */
    break;
  case 2:  /* general case */
    strcat(str_2, str_3);
    break;
}
/* str_2 holds the result */
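The return values that such a two-part scan produces can be checked with a small probe. This is a sketch under assumptions: count_fields is a hypothetical name, the 63-character widths are arbitrary, and the second conversion uses %[^#] rather than the \001-\377 range so that the example does not rely on implementation-defined behaviour.

```c
#include <stdio.h>

/* Hypothetical probe: reports how many fields sscanf converts
   when splitting s around a '#'. */
int count_fields(const char *s)
{
    char before[64];
    char after[64];

    return sscanf(s, "%63[^#]#%63[^#]", before, after);
}
```

With this format, "OH#989" converts both fields, "OH" and "OH#" convert only the first, "#989" converts none, and an empty string yields EOF rather than 0, which is why the empty input needs its own handling.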

Removing a target character from a string using sscanf

sscanf() is not the best tool for this task; see the simple loop further below.

// Not elegant code
// Width limits omitted for brevity.
str_2[0] = '\0';
char *p = str_2;

// Test for the end of the string
while (*str_1) {
  int n;  // record stopping offset
  int cnt = sscanf(str_1, "%[^#]%n", p, &n);
  if (cnt == 0) {  // first character is a #
    str_1++;  // advance to next
  } else {
    str_1 += n;  // advance n characters
    p += n; 
  }
}        
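The same %n-driven loop can be written with the width limits put back in. The following is a sketch, not the answerer's code: strip_hashes, the 64-byte chunk buffer, and the -1 error return are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: copies src to dst, dropping every '#'.
   Returns 0 on success, or -1 if dst (dst_size bytes) is too small,
   in which case dst holds a truncated, still terminated prefix. */
int strip_hashes(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;    /* characters stored in dst so far */

    if (dst_size == 0)
        return -1;
    dst[0] = '\0';

    while (*src != '\0') {
        char chunk[64];
        int n = 0;

        /* Read at most 63 non-'#' characters; %n records how many. */
        if (sscanf(src, "%63[^#]%n", chunk, &n) == 1) {
            if (used + (size_t)n >= dst_size)
                return -1;
            memcpy(dst + used, chunk, (size_t)n + 1);  /* include the '\0' */
            used += (size_t)n;
            src += n;
        } else {
            src++;      /* current character is '#': skip it */
        }
    }
    return 0;
}
```

Runs longer than 63 characters are simply handled over several iterations, so the chunk size limits only how much is copied per sscanf call.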

Simple loop:

Remove the needles from a haystack and save the hay in a bale (named bail in the code).

char needle = '#';
assert(needle);
do {
  while (*haystack == needle) haystack++;
} while (*bail++ = *haystack++);

With the 2nd method, code could use haystack = bail = str_1 to remove the character in place.
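Wrapped in a function, the in-place variant might look like the sketch below. The function name remove_char is an assumption; the loop itself is the one shown above, with both pointers starting at the same string.

```c
#include <assert.h>

/* Hypothetical wrapper: removes every occurrence of needle from s in place. */
void remove_char(char *s, char needle)
{
    const char *haystack = s;   /* read position */
    char *bail = s;             /* write position, never ahead of haystack */

    assert(needle != '\0');     /* the terminator itself cannot be removed */
    do {
        while (*haystack == needle)
            haystack++;         /* skip a run of needles */
    } while ((*bail++ = *haystack++) != '\0');  /* copy, stop after the '\0' */
}
```

Since s is modified, it has to be writable storage, for example char s[] = "OH#989"; rather than a pointer to a string literal.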

