Removing a target character from a string using sscanf

I've recently been learning about the different conversion specifiers, and I am struggling with one of the more complex ones: the bracket specifier (%[ set ]).

To my understanding, %[ set ] consumes and assigns any run of characters that are in set (the scanset), and %[^ set ] has the opposite effect: it consumes and assigns any run of characters that are not in the scanset.

That's my understanding, roughly explained. I was trying to use this specifier with sscanf to remove a specified character from a string:

 sscanf(str_1, "%[^#]", str_2);

Suppose that str_1 contains "OH#989". My intention is to store this string in str_2, but removing the hash character in the process. However, sscanf stops reading at the hash character, storing only "OH" when I am intending to store "OH989".

Am I using the correct method in the wrong way, or am I using the wrong method altogether? How can I correctly remove/extract a specified character from a string using sscanf? I know this is possible to achieve with other functions and operators, but ideally I am hoping to use sscanf.

The scanset matches a sequence of (one or more) characters that either are or aren't in the set between the brackets. It stops at the first character that ends the match: one that isn't in the set for %[ set ], or one that is in the set for %[^ set ]. To get the two parts of your string, you'd need to use something like:

sscanf(str_1, "%[^#]#%[^#]", str_2, str_3);

We can negotiate on the second conversion specification; it might be that %s is sufficient, or some other scanset is appropriate. But this would give you the 'before #' and 'after #' strings that could then be concatenated to give the desired result string.
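
As a rough sketch of that idea, the function below scans the two parts and joins them. The name remove_first_hash and its return convention are made up for illustration; it assumes dst has room for strlen(src) + 1 bytes and only handles the case where there is text on both sides of the '#'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the original answer.
 * Copies src into dst with its first '#' removed.
 * Returns 1 on success, or 0 (dst untouched) when sscanf did not
 * find text on both sides of a '#'.
 * Assumption: dst can hold at least strlen(src) + 1 bytes. */
int remove_first_hash(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);

    size_t size = strlen(src) + 1;
    char before[size];   /* C99 variable-length arrays */
    char after[size];

    /* %[^#] reads up to the '#', the literal # consumes it,
     * and the second %[^#] reads what follows. */
    if (sscanf(src, "%[^#]#%[^#]", before, after) != 2) {
        return 0;
    }

    strcpy(dst, before);
    strcat(dst, after);
    return 1;
}
```

With src set to "OH#989", dst ends up holding "OH989". Inputs such as "#989", "OH#" or "OH" make sscanf convert fewer than two fields, so the sketch reports failure rather than guessing. Note too that the second %[^#] stops at any later '#'.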

I guess, if you really want to use sscanf for the purpose of removing a single target character, you could do this:

char str_2[strlen(str_1) + 1];
if (sscanf(str_1, "%[^#]", str_2) == 1) {
    size_t len = strlen(str_2);
    /* must verify that a '#' was found at all */
    if (str_1[len] == '#') {
        /* append everything after the '#' */
        strcpy(str_2 + len, str_1 + len + 1);
    }
} else {
    /* no match: str_1 is empty or starts with '#' */
    strcpy(str_2, str_1 + (str_1[0] == '#'));
}

As you can see, sscanf is not the right tool for the job, because it has many quirks and shortcomings. A simple loop is more efficient and less error prone. You could also parse str_1 into 2 separate strings with sscanf(str_1, "%[^#]#%[\001-\377]", str_2, str_3); and deal with the possible return values (EOF for an empty string, otherwise 0, 1 or 2):

char str_2[strlen(str_1) + 1];
char str_3[strlen(str_1) + 1];
/* note: the meaning of '-' inside a scanset is implementation-defined */
switch (sscanf(str_1, "%[^#]#%[\001-\377]", str_2, str_3)) {
  case EOF:  /* empty string */
  case 0:    /* initial '#' */
    strcpy(str_2, str_1 + (str_1[0] == '#'));
    break;
  case 1:  /* no '#' or no trailing part */
    break;
  case 2:  /* general case */
    strcat(str_2, str_3);
    break;
}
/* str_2 holds the result */

sscanf() is not the best tool for this task; see the simple loop further below.

// Not elegant code
// Width limits omitted for brevity.
str_2[0] = '\0';  // result stays empty if nothing is kept
char *p = str_2;

// Test for the end of the string
while (*str_1) {
  int n;  // record stopping offset
  int cnt = sscanf(str_1, "%[^#]%n", p, &n);
  if (cnt == 0) {  // first character is a #
    str_1++;  // advance to next
  } else {
    str_1 += n;  // advance n characters
    p += n;  // the next run is appended after this one
  }
}
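
The loop above leaves out width limits. One way they might be added is sketched below: the format string is built at run time so that each %[^#] is capped at the space still free in the destination. The function name remove_hashes_bounded, its parameters and its return convention are assumptions for illustration, not part of the original answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst without any '#',
 * writing at most dst_size bytes including the terminator.
 * Returns 1 if everything fit, 0 if the output was truncated. */
int remove_hashes_bounded(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t room = dst_size - 1;  /* characters that can still be stored */
    dst[0] = '\0';

    while (*src != '\0') {
        if (*src == '#') {       /* skip the target character */
            src++;
            continue;
        }
        if (room == 0) {
            return 0;            /* more text to keep, but no space left */
        }

        /* Build e.g. "%5[^#]%n": read at most `room` non-'#' characters. */
        char fmt[32];
        snprintf(fmt, sizeof fmt, "%%%zu[^#]%%n", room);

        int n = 0;               /* number of characters consumed */
        if (sscanf(src, fmt, dst, &n) != 1) {
            return 0;            /* not expected: *src is not '#' here */
        }
        src += n;
        dst += n;                /* dst now points at the terminator */
        room -= (size_t)n;
    }
    return 1;
}
```

The amount of bookkeeping needed just to keep sscanf within bounds is another hint that the plain loop is the simpler choice.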

Simple loop:

Remove the needles from a haystack and save the hay in a bale (bail in the code).

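char needle = '#';
assert(needle);  // needs <assert.h>; a '\0' needle would run past the end
do {
  // skip every needle
  while (*haystack == needle) haystack++;
  // copy one character; stop once the '\0' has been copied
} while ((*bail++ = *haystack++) != '\0');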

With the 2nd method, code could use haystack = bail = str_1 to remove the character in place, provided str_1 points to modifiable memory.
