Segmentation fault when reading space separated values from file in C

I have an input file which contains space-separated values:

AA BB 4
A B
AB BA
AA CC
CC BB
A B 3 
A C
B C
c B

I want the contents of the first line stored in s1, t1 and n respectively. Each of the next n lines contains two space-separated strings, which should go into production_left and production_right respectively. This layout repeats for subsequent blocks of lines; the sample data has two blocks of input.

My code is given below:

int main()
{
    char *s1[20], *t1[20];
    char *production_left[20], *production_right[20];
    int n;
    FILE *fp;
    fp = fopen("suffix.txt","r");
    int iss=0;
    do{
         fscanf(fp,"%s %s %d",s1[iss],t1[iss],&n);
         int i=0;
         while(i<n)
         {
             fscanf(fp,"%s %s",production_left[i],production_right[i]);
             i++;
         }
    }while(!eof(fp));
}

It gives a segmentation fault every time.

A lot of C is simply keeping clear in your mind what data you need available in what scope (block) of your code. In your case, the data you are working with is the first (or header) line for each section of your data file. For that you have s1 and t1, but you have nothing that preserves n so that it remains available for reuse with your data. Since n holds the number of entries to expect in production_left and production_right under each heading, simply create an index array, e.g. int idx[MAXC] = {0};, to store the n associated with each s1 and t1. That way you preserve the value for use in iteration later on. (MAXC is just a defined constant for 20, to avoid magic numbers in your code.)

Next, you need to turn to your understanding of pointer declarations and their use. char *s1[MAXC], *t1[MAXC], *production_left[MAXC], *production_right[MAXC]; declares 4 arrays of pointers (20 pointers each) for s1, t1, production_left and production_right. The pointers are uninitialized and unallocated. While each can be assigned a pointer value, there is no storage (allocated memory) associated with any of them that would allow copying data. Passing such a pointer to fscanf with %s makes fscanf write through an indeterminate address, which is undefined behavior and the source of your segmentation fault. Nor can you simply read into one buffer with fscanf and assign that same pointer value to each element (they would all end up pointing to the last value read, if it remained in scope).

So you have two choices: (1) use a 2D array, or (2) allocate storage for each string, copy the string to the new block of memory, and assign the address of the start of that block to the pointer (e.g. s1[x]). The strdup function provides the allocate and copy in a single function call. If you do not have strdup, it is a simple function to write using strlen, malloc and memcpy (memmove is only required when source and destination may overlap).
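For reference, a strdup replacement along those lines might look like the sketch below. The name dupstr is hypothetical (chosen only to avoid clashing with a library strdup), and treating a NULL argument as an error is an assumption of this sketch rather than anything strdup itself guarantees.

```c
#include <stdlib.h>
#include <string.h>

/* dupstr: hypothetical stand-in for strdup().
 * Returns a newly allocated copy of s, or NULL if s is NULL or the
 * allocation fails. The caller is responsible for calling free(). */
char *dupstr (const char *s)
{
    if (!s)                         /* assumption: reject NULL input */
        return NULL;

    size_t len = strlen (s) + 1;    /* +1 for the nul-terminating char */
    char *copy = malloc (len);      /* allocate storage for the copy */

    if (copy)                       /* validate the allocation */
        memcpy (copy, s, len);      /* a fresh block cannot overlap s */

    return copy;
}
```

It would be used exactly like strdup, e.g. s1[iss] = dupstr (tmp1);, with the same NULL check on the result.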

Once you have identified what values you need to preserve for later use in your code, you have ensured the variables declared are scoped properly, and you have ensured that each is properly initialized and has storage allocated, all that remains is writing the logic to make things work as you intend.

Before turning to an example, it is worth noting that your data calls for line-oriented input. The scanf family provides formatted input, but it is often better to use a line-oriented function for the actual read (e.g. fgets) and then parse separately with, e.g., sscanf. In this case it is largely a wash, since your first values are strings and the %s format specifier skips leading whitespace, but that is often not the case. For instance, you are effectively reading:

    char tmp1[MAXC] = "", tmp2[MAXC] = "";
    ...
    if (fscanf (fp, "%s %s %d", tmp1, tmp2, &n) != 3)
        break;

That can easily be replaced with the somewhat more robust:

    char buf[MAXC] = "", tmp1[MAXC] = "", tmp2[MAXC] = "";
    ...
    if (!fgets (buf, sizeof buf, fp) ||
        sscanf (buf, "%s %s %d", tmp1, tmp2, &n) != 3)
        break;

(which validates the read and the conversion separately)

Above, also note the use of the temporary buffers tmp1 and tmp2 (and buf for use with fgets). It is very often advantageous to read input into temporary values that can be validated before final storage for later use.
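The read-into-temporaries, validate, then store pattern can be packaged as a small helper. The sketch below is only an illustration: parse_header is a hypothetical name, the %19s field widths (MAXC - 1) are an added safeguard not present in the snippets above, and rejecting a negative count is an assumption about what a valid header is.

```c
#include <stdio.h>
#include <string.h>

#define MAXC 20

/* parse_header: hypothetical helper.
 * Parses "<s1> <t1> <n>" from line. On success copies the two tokens
 * into a and b (each at least MAXC bytes), stores the count in *n and
 * returns 1. On failure returns 0 and leaves a, b and *n untouched. */
int parse_header (const char *line, char *a, char *b, int *n)
{
    char tmp1[MAXC] = "", tmp2[MAXC] = "";
    int cnt = 0;

    /* %19s limits each token to MAXC - 1 characters plus the nul */
    if (sscanf (line, "%19s %19s %d", tmp1, tmp2, &cnt) != 3 || cnt < 0)
        return 0;

    strcpy (a, tmp1);   /* only store after all 3 conversions succeed */
    strcpy (b, tmp2);
    *n = cnt;

    return 1;
}
```

Because nothing is written to the caller's variables until every conversion has succeeded, a data line such as "A B" passed in by mistake is rejected without disturbing values stored earlier.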

What remains is simply putting the pieces together in the correct order to accomplish your goals. Below is an example that reads data from the filename given as the first argument (or from stdin if no filename is given) and then outputs the data and frees all memory allocated before exiting.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 20

int main (int argc, char **argv) {

    char *s1[MAXC], *t1[MAXC],
        *production_left[MAXC], *production_right[MAXC];
    int idx[MAXC] = {0},              /* storage for 'n' values */
        iss = 0, ipp = 0, pidx = 0;
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* loop until no header row read, protecting array bounds */
    for (; ipp < MAXC && iss < MAXC; iss++) {
        char buf[MAXC] = "", tmp1[MAXC] = "", tmp2[MAXC] = "";
        int n = 0;

        if (!fgets (buf, sizeof buf, fp) ||
            sscanf (buf, "%s %s %d", tmp1, tmp2, &n) != 3)
            break;

        idx[iss] = n;
        s1[iss] = strdup (tmp1);    /* strdup - allocate & copy */
        t1[iss] = strdup (tmp2);

        if (!s1[iss] || !t1[iss]) { /* if either is NULL, handle error */
            fprintf (stderr, "error: s1 or t1 empty/NULL, iss: %d\n", iss);
            return 1;
        }

        /* read 'n' data lines from file, protecting array bounds */
        for (int i = 0; i < n && ipp < MAXC; i++, ipp++) {
            char ptmp1[MAXC] = "", ptmp2[MAXC] = "";

            if (!fgets (buf, sizeof buf, fp) ||
                sscanf (buf, "%s %s", ptmp1, ptmp2) != 2) {
                fprintf (stderr, "error: read failure, ipp: %d\n", ipp);
                return 1;
            }
            production_left[ipp] = strdup (ptmp1);
            production_right[ipp] = strdup (ptmp2);

            if (!production_left[ipp] || !production_right[ipp]) {
                fprintf (stderr, "error: production_left or "
                        "production_right empty/NULL, iss: %d\n", iss);
                return 1;
            }
        }
    }

    if (fp != stdin) fclose (fp);     /* close file if not stdin */

    for (int i = 0; i < iss; i++) {
        printf ("%-8s %-8s  %2d\n", s1[i], t1[i], idx[i]);
        free (s1[i]);  /* free s & t allocations */
        free (t1[i]);
        for (int j = pidx; j < pidx + idx[i]; j++) {
            printf ("  %-8s %-8s\n", production_left[j], production_right[j]);
            free (production_left[j]);  /* free production allocations */
            free (production_right[j]);
        }
        pidx += idx[i];  /* increment previous index value */
    }

    return 0;
}
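
The program above takes choice (2), allocating each string. For comparison, choice (1), a 2D array, needs no allocation at all, because every row already has its own storage. A minimal sketch follows, assuming no token is longer than MAXC - 1 characters; read_productions and MAXP are hypothetical names introduced only for this illustration.

```c
#include <stdio.h>

#define MAXC 20   /* max characters per token, including the nul */
#define MAXP 20   /* max production rows (hypothetical constant) */

/* read_productions: hypothetical helper.
 * Reads up to n "left right" lines from fp into the 2D arrays left and
 * right (each with at least MAXP rows). Returns the number of rows
 * stored, which is less than n on a short read or a malformed line. */
int read_productions (FILE *fp, int n, char left[][MAXC], char right[][MAXC])
{
    char buf[2 * MAXC + 2];   /* two tokens, a space and a newline */
    int i = 0;

    while (i < n && i < MAXP && fgets (buf, sizeof buf, fp)) {
        /* each row is real storage, so sscanf can write into it */
        if (sscanf (buf, "%19s %19s", left[i], right[i]) != 2)
            break;
        i++;
    }

    return i;
}
```

The caller would declare, e.g., char production_left[MAXP][MAXC], production_right[MAXP][MAXC]; and compare the return value against n. The trade-off is fixed memory use regardless of how long the strings actually are, in exchange for having nothing to free.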

Example Input File

$ cat dat/production.txt
AA BB 4
A B
AB BA
AA CC
CC BB
A B 3
A C
B C
c B

Example Use/Output

$ ./bin/production <dat/production.txt
AA       BB         4
  A        B
  AB       BA
  AA       CC
  CC       BB
A        B          3
  A        C
  B        C
  c        B

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so that (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use: just run your program through it.

$ valgrind ./bin/production <dat/production.txt
==3946== Memcheck, a memory error detector
==3946== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==3946== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==3946== Command: ./bin/production
==3946==
AA       BB         4
  A        B
  AB       BA
  AA       CC
  CC       BB
A        B          3
  A        C
  B        C
  c        B
==3946==
==3946== HEAP SUMMARY:
==3946==     in use at exit: 0 bytes in 0 blocks
==3946==   total heap usage: 18 allocs, 18 frees, 44 bytes allocated
==3946==
==3946== All heap blocks were freed -- no leaks are possible
==3946==
==3946== For counts of detected and suppressed errors, rerun with: -v
==3946== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Look things over and let me know if you have further questions.
