
C string: multiple replacements within a character string

Let's say I have a string:

"(aaa and bbb or (aaa or aaa or bbb))"

For simplicity's sake, this will always be the format of the string: always three a's followed by a space or ')', or three b's followed by a space or ')'.

What would be the best way to replace every occurrence of 'aaa' with a '1' and every occurrence of 'bbb' with a '0' in C? The resulting string should look like:

"(1 and 0 or (1 or 1 or 0))"

EDIT: Let me be more specific:

char* blah = (char *) malloc (8);
sprintf(blah, "%s", "aaa bbb");

blah = replace(blah);

How can I write replace so that it allocates space and stores a new string:

"1 0"

The most efficient way is to use the POSIX regex family of functions. Some implementations will build a proper automaton for the patterns. Another way is to repeatedly use KMP or Boyer-Moore search, but then you have to scan the string several times, which is less efficient. In addition, what result do you want for an input such as aa=1, ab=2, bb=3 on the string "aabb"?

By the way, when you implement this function, a cleaner solution is to allocate a new dynamic C string rather than modify the original string while replacing. You could implement an in-place replacement, but that would be much more complicated.

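/* Needs <regex.h>, <stdlib.h>, <string.h>; oristr must point to malloc'ed memory. */
regex_t r; regmatch_t match[1]; size_t last = 0;
char *newstr = calloc(strlen(oristr) + 1, 1);      /* enough here: the result only shrinks */
regcomp(&r, "aaa|bbb", REG_EXTENDED);
/* Resume after the previous match; offsets are relative to oristr + last. */
while (regexec(&r, oristr + last, 1, match, 0) == 0) {
  const char *hit = oristr + last + match[0].rm_so;
  strncat(newstr, oristr + last, match[0].rm_so);  /* text before the match */
  strcat(newstr, hit[0] == 'a' ? "1" : "0");       /* a hash table lookup would generalize this */
  last += match[0].rm_eo;
}
strcat(newstr, oristr + last);                     /* remaining tail */
oristr = realloc(oristr, strlen(newstr) + 1);      /* +1 for the terminating NUL */
strcpy(oristr, newstr); free(newstr); regfree(&r);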

In a practical implementation, you should change the size of newstr dynamically, and you should record the end of newstr rather than using strcat/strlen. The source code may be buggy as I have not really tried it, but the idea is there. This is the most efficient implementation I can think of.
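The two pieces of advice above (grow the output buffer on demand, and track its end instead of rescanning with strcat/strlen) could look roughly like the sketch below. The names strbuf and sb_append and the doubling growth policy are illustrative assumptions, not part of the answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: `len` records the end of the string, so
   appending never has to rescan the text the way strcat/strlen would. */
typedef struct {
    char  *data;  /* NUL-terminated once something has been appended */
    size_t len;   /* characters stored, excluding the NUL */
    size_t cap;   /* bytes allocated */
} strbuf;

/* Append n bytes from src; returns 0 on success, -1 if out of memory. */
static int sb_append(strbuf *sb, const char *src, size_t n)
{
    assert(sb != NULL && src != NULL);
    if (sb->len + n + 1 > sb->cap) {
        size_t newcap = sb->cap ? sb->cap : 16;
        char *p;
        while (newcap < sb->len + n + 1)
            newcap *= 2;              /* assumed policy: double until it fits */
        p = realloc(sb->data, newcap);
        if (p == NULL)
            return -1;                /* sb->data is still valid and unchanged */
        sb->data = p;
        sb->cap = newcap;
    }
    memcpy(sb->data + sb->len, src, n);
    sb->len += n;
    sb->data[sb->len] = '\0';
    return 0;
}
```

Start from `strbuf sb = {NULL, 0, 0};`, append the text before each match and then the replacement, and free `sb.data` when done.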

For this specific case, a simple while/for loop would do the trick. But it looks like a homework problem, so I won't write it out explicitly for you. Had more generic string manipulation been required, I would use PCRE.
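For readers who want to see the shape of such a loop, here is one possible sketch. It is not the answerer's code; the name replace_simple is hypothetical, and it returns a new string instead of reallocating its argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly malloc'ed copy of src with every "aaa" replaced by "1"
   and every "bbb" replaced by "0", or NULL if out of memory.
   The caller frees the result; src itself is not modified. */
char *replace_simple(const char *src)
{
    char *out, *dst;
    assert(src != NULL);
    /* The result is never longer than the input, so this is enough. */
    out = malloc(strlen(src) + 1);
    if (out == NULL)
        return NULL;
    dst = out;
    while (*src != '\0') {
        if (strncmp(src, "aaa", 3) == 0) {        /* strncmp stops at the NUL */
            *dst++ = '1';
            src += 3;
        } else if (strncmp(src, "bbb", 3) == 0) {
            *dst++ = '0';
            src += 3;
        } else {
            *dst++ = *src++;                      /* copy everything else as-is */
        }
    }
    *dst = '\0';
    return out;
}
```

With the question's variables, the call would be `char *result = replace_simple(blah);` followed by `free(blah); blah = result;` once the result has been checked for NULL.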

Here it is without memory limitation: 这里没有内存限制:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/* ---------------------------------------------------------------------------
  Name       : replace - Search & replace a substring by another one. 
  Creation   : Thierry Husson, Sept 2010
  Parameters :
      str    : Big string where we search
      oldstr : Substring we are looking for
      newstr : Substring we want to replace with
      count  : Optional pointer to int (input / output value). NULL to ignore.  
               Input:  Maximum replacements to be done. NULL or < 1 to do all.
               Output: Number of replacements done or -1 if not enough memory.
  Returns    : Pointer to the new string or NULL if error.
  Notes      : 
     - Case sensitive - Otherwise, replace functions "strstr" by "strcasestr"
     - Always allocate memory for the result.
--------------------------------------------------------------------------- */
char* replace(const char *str, const char *oldstr, const char *newstr, int *count)
{
   const char *tmp = str;
   char *result;
   int   found = 0;
   int   length, reslen;
   int   oldlen = strlen(oldstr);
   int   newlen = strlen(newstr);
   int   limit = (count != NULL && *count > 0) ? *count : -1; 

   tmp = str;
   while ((tmp = strstr(tmp, oldstr)) != NULL && found != limit)
      found++, tmp += oldlen;

   length = strlen(str) + found * (newlen - oldlen);
   if ( (result = (char *)malloc(length+1)) == NULL) {
      fprintf(stderr, "Not enough memory\n");
      found = -1;
   } else {
      tmp = str;
      limit = found; /* Countdown */
      reslen = 0; /* length of current result */ 
      /* Replace each old string found with new string  */
      while ((limit-- > 0) && (tmp = strstr(tmp, oldstr)) != NULL) {
         length = (tmp - str); /* Number of chars to keep untouched */
         strncpy(result + reslen, str, length); /* Original part kept */
         strcpy(result + (reslen += length), newstr); /* Insert new string */
         reslen += newlen;
         tmp += oldlen;
         str = tmp;
      }
      strcpy(result + reslen, str); /* Copies last part and ending nul char */
   }
   if (count != NULL) *count = found;
   return result;
}


/* ---------------------------------------------------------------------------
   Samples
--------------------------------------------------------------------------- */
int main(void)
{
   char *str, *str2;
   int rpl;

   /* ---------------------------------------------------------------------- */
   /* Simple sample */
   rpl = 0; /* Unlimited replacements */
   str = replace("Hello World!", "World", "Canada", &rpl);
   printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
   /* Replacements: 1        Result: [Hello Canada!] */
   free(str);

   /* ---------------------------------------------------------------------- */
   /* Sample with dynamic memory to clean */
   rpl = 0; /* Unlimited replacements */
   str = strdup("abcdef");
   if ( (str2 = replace(str, "cd", "1234", &rpl)) != NULL ) {
      free(str);
      str = str2;
   }
   printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
   /* Replacements: 1        Result: [ab1234ef] */
   free(str);

   /* ---------------------------------------------------------------------- */
   /* Unlimited replacements - Case sensitive & Smaller result */
   str = replace("XXXHello XXXX world XX salut xxx monde!XXX", "XXX", "-",NULL);
   printf("Result: [%s]\n\n", str);
   /* Result: [-Hello -X world XX salut xxx monde!-] */
   free(str);

   /* ---------------------------------------------------------------------- */
   rpl = 3; /* Limited replacements */
   str = replace("AAAAAA", "A", "*", &rpl);
   printf("Replacements: %d\tResult: [%s]\n\n", rpl, str);
   /* Replacements: 3        Result: [***AAA] */
   free(str);

  return 0;
}
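Applied to the original question, a general search-and-replace function has to be called once per pattern, and the intermediate string must be freed. The sketch below shows that chaining. To stay self-contained it uses a compact, hypothetical stand-in named replace_all, assumed to behave like replace(str, oldstr, newstr, NULL); replace_aaa_bbb is likewise an illustrative name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: replaces every occurrence of oldstr with newstr and
   returns a newly malloc'ed string, or NULL if out of memory. */
static char *replace_all(const char *str, const char *oldstr, const char *newstr)
{
    size_t oldlen = strlen(oldstr), newlen = strlen(newstr), count = 0;
    const char *p;
    char *result, *dst;
    assert(oldlen > 0);                     /* an empty pattern would never advance */
    for (p = str; (p = strstr(p, oldstr)) != NULL; p += oldlen)
        count++;
    result = malloc(strlen(str) - count * oldlen + count * newlen + 1);
    if (result == NULL)
        return NULL;
    dst = result;
    while ((p = strstr(str, oldstr)) != NULL) {
        size_t keep = (size_t)(p - str);
        memcpy(dst, str, keep);             /* text before the match */
        memcpy(dst + keep, newstr, newlen); /* the replacement */
        dst += keep + newlen;
        str = p + oldlen;
    }
    strcpy(dst, str);                       /* tail and terminating NUL */
    return result;
}

/* "aaa" -> "1", then "bbb" -> "0"; the intermediate string is freed here. */
char *replace_aaa_bbb(const char *str)
{
    char *step1, *step2;
    assert(str != NULL);
    step1 = replace_all(str, "aaa", "1");
    if (step1 == NULL)
        return NULL;
    step2 = replace_all(step1, "bbb", "0");
    free(step1);
    return step2;
}
```

Two passes are safe here only because the first replacement ("1") can never create a new "bbb" match. With patterns that overlap, the order of the passes would change the result.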

This is in no way the world's most elegant solution, and it assumes that the resulting string is never larger than the original. I also hardcoded the conversions, but hopefully it points you more or less in the right direction or gives you an idea to start from:

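#include <stdlib.h>
#include <string.h>

/* Replaces every "aaa" with '1' and every "bbb" with '0'.
   `string` must point to malloc'ed memory. The result is never longer than
   the input, so it is copied back and the block is then shrunk. */
char* replace( char *string ) {
    char *src = string;
    char *buffer = malloc( strlen( string ) + 1 );  /* +1 for the NUL */
    size_t length = 0;
    char *aaa, *bbb, *shrunk;
    if ( buffer == NULL ) return NULL;
    aaa = strstr( src, "aaa" );
    bbb = strstr( src, "bbb" );
    while ( aaa || bbb ) {
        /* handle whichever match comes first */
        int useA = aaa && ( !bbb || aaa < bbb );
        char *hit = useA ? aaa : bbb;
        size_t startToHere = (size_t)( hit - src );
        memcpy( buffer + length, src, startToHere );  /* text before the match */
        length += startToHere;
        buffer[length++] = useA ? '1' : '0';
        src = hit + 3;                                /* skip the 3-char match */
        aaa = strstr( src, "aaa" );
        bbb = strstr( src, "bbb" );
    }
    strcpy( buffer + length, src );  /* copy the tail and the NUL */
    strcpy( string, buffer );        /* fits: the result is not longer */
    free( buffer );
    shrunk = realloc( string, strlen( string ) + 1 );
    return shrunk ? shrunk : string;
}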

Disclaimer: I didn't even test this, but it should be at least roughly in the direction of what you want.

This is a job for an FSM!

#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
//     | 0          | 1             | 2              | 3             | 4              |
// ----+------------+---------------+----------------+---------------+----------------+
// 'a' | 1          | 2             | ('1') 0        | ('b') 1       | ('bb') 1       |
// 'b' | 3          | ('a') 3       | ('aa') 3       | 4             | ('0') 0        |
// NUL | (NUL) halt | ('a'NUL) halt | ('aa'NUL) halt | ('b'NUL) halt | ('bb'NUL) halt |
// (*) | (*) 0      | ('a'*) 0      | ('aa'*) 0      | ('b'*) 0      | ('bb'*) 0      |
*/

void chg_data(char *src) {
  char *dst, ch;
  int state = 0;
  dst = src;
  for (;;) {
    ch = *src++;
    if (ch == 'a' && state == 0) {state=1;}
    else if (ch == 'a' && state == 1) {state=2;}
    else if (ch == 'a' && state == 2) {state=0; *dst++='1';}
    else if (ch == 'a' && state == 3) {state=1; *dst++='b';}
    else if (ch == 'a' && state == 4) {state=1; *dst++='b'; *dst++='b';}
    else if (ch == 'b' && state == 0) {state=3;}
    else if (ch == 'b' && state == 1) {state=3; *dst++='a';}
    else if (ch == 'b' && state == 2) {state=3; *dst++='a'; *dst++='a';}
    else if (ch == 'b' && state == 3) {state=4;}
    else if (ch == 'b' && state == 4) {state=0; *dst++='0';}
    else if (ch == '\0' && state == 0) {*dst++='\0'; break;}
    else if (ch == '\0' && state == 1) {*dst++='a'; *dst++='\0'; break;}
    else if (ch == '\0' && state == 2) {*dst++='a'; *dst++='a'; *dst++='\0'; break;}
    else if (ch == '\0' && state == 3) {*dst++='b'; *dst++='\0'; break;}
    else if (ch == '\0' && state == 4) {*dst++='b'; *dst++='b'; *dst++='\0'; break;}
    else if (state == 0) {state=0; *dst++=ch;}
    else if (state == 1) {state=0; *dst++='a'; *dst++=ch;}
    else if (state == 2) {state=0; *dst++='a'; *dst++='a'; *dst++=ch;}
    else if (state == 3) {state=0; *dst++='b'; *dst++=ch;}
    else if (state == 4) {state=0; *dst++='b'; *dst++='b'; *dst++=ch;}
    else assert(0 && "this didn't happen!");
  }
}

int main(void) {
  char data[] = "(aaa and bbb or (aaa or aaa or bbb))";
  printf("Before: %s\n", data);
  chg_data(data);
  printf(" After: %s\n", data);
  return 0;
}
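The five states in the table amount to "no pending run", "run of one or two a's", and "run of one or two b's". Under that reading, the same in-place rewrite can be sketched with a run letter and a run length instead of an explicit transition chain. This is an alternative formulation, not the code above; the name chg_data_runs is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* In-place: "aaa" -> '1', "bbb" -> '0'. The write pointer never passes the
   read pointer, because every output character replaces one or more inputs. */
void chg_data_runs(char *src)
{
    char *dst = src;
    char run = '\0';   /* letter of the pending run: 'a' or 'b' */
    int  n = 0;        /* length of the pending run: 0, 1 or 2 */
    assert(src != NULL);
    for (;;) {
        char ch = *src++;
        int letter = (ch == 'a' || ch == 'b');
        if (letter && (n == 0 || ch == run)) {
            run = ch;
            if (++n == 3) {          /* full triple: emit the replacement */
                *dst++ = (ch == 'a') ? '1' : '0';
                n = 0;
            }
            continue;
        }
        while (n > 0) {              /* run was interrupted: flush it unchanged */
            *dst++ = run;
            n--;
        }
        if (letter) {                /* ch starts a run of the other letter */
            run = ch;
            n = 1;
            continue;
        }
        *dst++ = ch;                 /* ordinary character, or the final NUL */
        if (ch == '\0')
            break;
    }
}
```

The explicit table makes every transition visible, while this form is shorter and would extend to longer runs by changing the constant 3.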

You can try a loop for each replacement using the C++ functions std::find and std::replace. You will find more information about std::string here.
