How to Allocate Memory to a Pointer for Entered Input (in 'C')

I'm working on an assembler for a hypothetical machine (the SMAC-0 machine) and need some help with memory allocation.

I'll be getting and tokenizing strings from a given file and will save these tokens in pointers.

Here's a code snippet:

tokenCount = sscanf(buffer,"%s %s %s %s", tokenOne, tokenTwo, tokenThree, tokenFour);

where tokenCount is an integer, buffer is the temporary buffer that stores the line taken from the input file, and tokenOne, tokenTwo, tokenThree, and tokenFour are character pointers.

The strings accepted from the file can have one to four words:

Example:


            READ    N
    N:      DS      1
    SUM:    DS      1
    LOOP:   MOVER   AREG    N
            ADD     AREG    N
            COMP    AREG    ='5'
            BC      LE      LOOP
            MOVEM   AREG    SUM
            PRINT   SUM
            STOP

My queries are:

  • How can I find out how large a token is, and thus how much memory to allocate for the respective token pointer? (The same question applies to the buffer pointer, since the labels, e.g. LOOP, N, SUM, can be of variable size.)

  • How can I do the same using scanf() or other input functions like gets()?

    You should declare your token buffers large enough. To be on the safe side, it's a good idea to make all of them as large as the input buffer itself. See the thread "How to prevent scanf causing a buffer overflow in C?" for more information.

    If you're using the GNU C library (glibc, the default with GCC on Linux), you can make use of an extension that dynamically allocates buffers on your behalf. See "Dynamic allocation with scanf()".

    EXAMPLES:

    Using predefined buffers for the scanned tokens:

    Note that all token buffers have the same size as the input buffer:

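    /* sscanf-test.c */
    #include <stdio.h>
    
    /* An enum constant is a true compile-time constant, so the arrays below
       are ordinary fixed-size arrays rather than variable-length arrays. */
    enum { BufferSize = 256 };
    
    int main(void)
    {
      char buffer[BufferSize];
      char tokenOne[BufferSize];
      char tokenTwo[BufferSize];
      char tokenThree[BufferSize];
      char tokenFour[BufferSize];
      FILE *file = fopen("sample.txt", "r");
    
      if (file == NULL)
      {
        perror("sample.txt");
        return 1;
      }
    
      while (fgets(buffer, sizeof(buffer), file) != NULL)
      {
        /* Clear the tokens so that unmatched ones print as empty strings. */
        tokenOne[0]='\0';
        tokenTwo[0]='\0';
        tokenThree[0]='\0';
        tokenFour[0]='\0';
        /* No token can be longer than the line it came from, so plain %s
           cannot overflow buffers that are as large as the input buffer. */
        int tokenCount = sscanf(buffer, "%s %s %s %s", tokenOne, tokenTwo, tokenThree, tokenFour);
        printf("scanned %d tokens   1:%s 2:%s 3:%s 4:%s\n", tokenCount, tokenOne, tokenTwo, tokenThree, tokenFour);
      }
    
      fclose(file);
      return 0;
    }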
    

    The program produces the following output (I cleaned up the formatting a little bit to improve readability):

    gcc sscanf-test.c -o sscanf-test
    ./sscanf-test 
    scanned 2 tokens   1:READ   2:N    3:     4: 
    scanned 3 tokens   1:N:    2:DS    3:1    4: 
    scanned 3 tokens   1:SUM:  2:DS    3:1    4: 
    scanned 4 tokens   1:LOOP: 2:MOVER 3:AREG 4:N 
    scanned 3 tokens   1:ADD   2:AREG  3:N    4: 
    scanned 3 tokens   1:COMP  2:AREG  3:='5' 4: 
    scanned 3 tokens   1:BC    2:LE    3:LOOP 4: 
    scanned 3 tokens   1:MOVEM 2:AREG  3:SUM  4: 
    scanned 2 tokens   1:PRINT 2:SUM   3:     4: 
    scanned 1 tokens   1:STOP  2:      3:     4:
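
    The example above is safe because every token buffer is as large as the line buffer. If smaller token buffers are preferred, a field width in each conversion keeps sscanf from writing past them. The following is a minimal sketch, not part of the original answer: TOKEN_SIZE and scan_tokens are hypothetical names, and the limit of 15 characters per token is an arbitrary assumption.

```c
#include <stdio.h>

#define TOKEN_SIZE 16   /* hypothetical limit: 15 characters plus '\0' */

/* Splits line into at most four tokens and returns how many were stored.
 * The width 15 in "%15s" must stay one less than TOKEN_SIZE. */
int scan_tokens(const char *line, char tokens[4][TOKEN_SIZE])
{
    /* Clear all tokens so that unmatched ones are empty strings. */
    for (int i = 0; i < 4; i++)
        tokens[i][0] = '\0';

    int count = sscanf(line, "%15s %15s %15s %15s",
                       tokens[0], tokens[1], tokens[2], tokens[3]);

    return count < 0 ? 0 : count;   /* sscanf returns EOF for an empty line */
}
```

    Note that a word longer than 15 characters is not rejected: the remainder is simply read as the next token. Real code would probably want to detect that case and report an error.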

    If you want to store the scanned tokens for later processing, you'll have to copy them somewhere else inside the while loop, because the token buffers are overwritten on every iteration. You can use the function strlen to get the length of a token (excluding the terminating '\0'), so a copy needs strlen(token) + 1 bytes.
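
    Copying a token as described above could look like the following minimal sketch. The helper name copy_token is hypothetical and not part of the original answer; it assumes the token is a valid NUL-terminated string.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of token, or NULL if allocation fails.
 * The caller owns the copy and must free() it when done. */
char *copy_token(const char *token)
{
    size_t length = strlen(token);        /* excludes the terminating '\0' */
    char *copy = malloc(length + 1);      /* +1 leaves room for the '\0' */

    if (copy != NULL)
        memcpy(copy, token, length + 1);  /* copies the '\0' as well */

    return copy;
}
```

    Inside the loop, a call such as char *label = copy_token(tokenOne); would then keep the label alive after tokenOne is overwritten by the next line.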

    Using dynamic memory allocation for tokens:

    As mentioned above, you can also let scanf allocate the buffers for you dynamically. The scanf(3) man page states that you can use the GNU extensions 'a' or 'm' to do that. About 'a' it says:

    An optional 'a' character. This is used with string conversions, and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, scanf() allocates a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be initialized before the call). The caller should subsequently free(3) this buffer when it is no longer required. This is a GNU extension; C99 employs the 'a' character as a conversion specifier (and it can also be used as such in the GNU implementation)

    I couldn't get scanf to work using the 'a' modifier. However, there's also the 'm' modifier which does the same thing (and more):

    Since version 2.7, glibc also provides the m modifier for the same purpose as the a modifier. The m modifier has the following advantages:

    • It may also be applied to %c conversion specifiers (eg, %3mc).

    • It avoids ambiguity with respect to the %a floating-point conversion specifier (and is unaffected by gcc -std=c99 etc.)

    • It is specified in the upcoming revision of the POSIX.1 standard.

    /* sscanf-alloc.c */
    #include <stdio.h>
    #include <stdlib.h>
    
    enum { BufferSize = 64 };
    
    int main(void)
    {
      char buffer[BufferSize];
      char *tokenOne   = NULL;
      char *tokenTwo   = NULL;
      char *tokenThree = NULL;
      char *tokenFour  = NULL;
      FILE *file = fopen("sample.txt", "r");
    
      if (file == NULL)
      {
        perror("sample.txt");
        return 1;
      }
    
      while (fgets(buffer, sizeof(buffer), file) != NULL)
      {
        // note the '&': with %ms, sscanf needs a pointer to each char * so it can store the address of the buffer it allocates.
        int tokenCount = sscanf(buffer, "%ms %ms %ms %ms", &tokenOne, &tokenTwo, &tokenThree, &tokenFour);
        // unmatched tokens are still NULL here; glibc prints them as "(null)", but passing NULL for %s is not portable.
        printf("scanned %d tokens   1:%s 2:%s 3:%s 4:%s\n", tokenCount, tokenOne, tokenTwo, tokenThree, tokenFour);
    
        // note: the memory has to be freed to avoid leaks; free(NULL) is a no-op, so unmatched tokens are safe to free.
        free(tokenOne);
        free(tokenTwo);
        free(tokenThree);
        free(tokenFour);
        // reset the pointers so that a stale address is never freed again on the next line
        tokenOne   = NULL;
        tokenTwo   = NULL;
        tokenThree = NULL;
        tokenFour  = NULL;
      }
    
      fclose(file);
      return 0;
    }
    
    Compiling and running it gives the following output:

    gcc sscanf-alloc.c -o sscanf-alloc
    ./sscanf-alloc
    scanned 2 tokens   1:READ  2:N      3:(null) 4:(null)
    scanned 3 tokens   1:N:    2:DS     3:1      4:(null)
    scanned 3 tokens   1:SUM:  2:DS     3:1      4:(null)
    scanned 4 tokens   1:LOOP: 2:MOVER  3:AREG   4:N
    scanned 3 tokens   1:ADD   2:AREG   3:N      4:(null)
    scanned 3 tokens   1:COMP  2:AREG   3:='5'   4:(null)
    scanned 3 tokens   1:BC    2:LE     3:LOOP   4:(null)
    scanned 3 tokens   1:MOVEM 2:AREG   3:SUM    4:(null)
    scanned 2 tokens   1:PRINT 2:SUM    3:(null) 4:(null)
    scanned 1 tokens   1:STOP  2:(null) 3:(null) 4:(null)
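
    Both examples still read each line into a fixed-size buffer, so the question about sizing the buffer pointer remains open. On POSIX systems, getline() can allocate and grow the line buffer itself. The following is a minimal sketch under that assumption, not part of the original answer; the function name count_statement_lines is hypothetical, and it only counts the lines that contain at least one token to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L   /* must precede the includes to expose getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Reads the stream line by line, with no fixed limit on line length, and
 * returns the number of lines that contain at least one token. */
size_t count_statement_lines(FILE *in)
{
    char *line = NULL;       /* getline() allocates and grows this buffer */
    size_t capacity = 0;     /* current size of that buffer, kept by getline() */
    size_t statements = 0;
    char first[2];           /* room for one character plus '\0' */

    while (getline(&line, &capacity, in) != -1)
    {
        /* "%1s" skips leading whitespace and matches only if a token follows. */
        if (sscanf(line, "%1s", first) == 1)
            statements++;
    }

    free(line);              /* a single free() covers the whole loop */
    return statements;
}
```

    The same buffer is reused for every line, so anything that must outlive the current iteration still has to be copied out of it.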
