SIGSEGV Segmentation Fault at strcmp() on a forked() child process

I am writing a program that iterates through each file/folder in a directory, and for every CSV file it encounters it will fork(), then sort the file and output the sorted CSV.

The sorting happens in beginSort(), which runs mergesort on the CSV and then outputs the result.

Here's the problem: when I run beginSort() in the default main parent process, it works exactly as expected. However, when it is run from a forked child process, the code fails immediately after this line (line 170):

fgets(titleRow.rowValue, 999, fp);

I can't make sense of what is happening. When I set a breakpoint before this line, GDB still runs right through it and I can't get any meaningful information. Here is the full sorter.c file:

#include "Sorter.h"
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include <unistd.h>
#include <dirent.h>


int isString(char* string) {

    int i = 0;
    int decimal = 0;

    // empty string
    if(string == NULL || string == ""){
        return 1;
    }

    // goes character by character to check if its a number or string
    while(i < strlen(string)){

        if( !isdigit(string[i])){
            if(string[i] == '.' && !decimal){
                if(i == strlen(string) - 1 || !isdigit(string[i + 1])){
                    return 1;
                }
                decimal = 1;
                i++;
                continue;
            }
            return 1;
        }
        i++;
    }
    return 0;
}

// trims trailing and leading blank spaces in a string
char* removeWhitespace(char *string, int i) {

  char *final;

  while(isspace((unsigned char)*string)){
    string++;
  }

  if(*string == 0) {
    return string;
  }
  final = string + i;
  while(final > string && isspace((unsigned char)*final)) {
      final--;
  }
  *(final+1) = 0;
  return string;
}

// splits row by commas and places them into the structs
char** customStrTok(char* line, int sortedColumnNum) {

    int i = 0;
    int j = 0;
    int k = 0;

    // stores resulting fields
    char** result = (char**)malloc(sizeof(char*) * (sortedColumnNum + 1));

    char* container = (char*)malloc(500);

    // checks for quotation marks in string
    int boolIsQuote = 0;

    //go through each character
    while(i < strlen(line)){

        if(line[i] == '"' && boolIsQuote == 0){
            boolIsQuote = 1;
        }

        else if(line[i] == '"' && boolIsQuote == 1){
            //store value in result
            result[k] = (char*) malloc((j + 1) * sizeof(char));
            container = removeWhitespace(container, j - 1);
            strcpy(result[k], container);
            memset(&container[0], 0, strlen(container));
            boolIsQuote = 0;
            j = 0;
            k++;
            i++;
        }

        //splits row by columns
        else if((line[i] == ',' || i == strlen(line) - 1) && boolIsQuote != 1){
            //if there is no character; (eg: ,,)
            if(!container){
                container[0] = '\0';
            }
            if(i == strlen(line) - 1 && line[i] != '\n'){
                container[j] = line[i];
                j++;
            }
            // copy into result array
            result[k] = (char*)malloc((j+1) * sizeof(char));
            container = removeWhitespace(container, j - 1);

            strcpy(result[k], container);

            memset(&container[0], 0, strlen(container));

            j = 0;
            k++;

            // if comma is at the end
            if(line[i] == ',' && i == strlen(line) - 2){

                container[0] = '\0';

                result[k] = (char*)malloc((j+1) * sizeof(char));

                strcpy(result[k], container);
                memset(&container[0], 0, strlen(container));
            }

        } else{

            //copy into container
            if(j == 0){
                if(line[i] == ' '){
                    i++;
                    continue;
                }
            }
            container[j] = line[i];
            j++;
        }
        i++;
    }
    i = 0;

    return result;
}

void beginSort(char* selectedColumn, char* fileName){

    printf("%s\n", "Begin sort called");

    FILE* fp;

    fp = fopen(fileName, "r");

        row titleRow;
        int sortedColumnNum = 1;
        char *token;

        // sets up the row of column titles

        titleRow.rowValue = (char*) malloc (sizeof(char) * 1000);



        printf("%s\n", "Program stops here when on forked process");
        printf("I failed. This is my id, %d, and this is my parents id %d\n", getpid(), getppid());
        fgets(titleRow.rowValue, 999, fp);
        printf("%s\n", "Got passed fgets");


        titleRow.rowLength = strlen(titleRow.rowValue);
        titleRow.fields = (char**) malloc(sizeof(char *) * titleRow.rowLength);



        token = strtok(titleRow.rowValue, ",");
        titleRow.fields[0] = token;

        //Beginning splitting the tokens and check if the column name entered exists
        int selectedColumnExist = 0;



        while((token = strtok(NULL, ","))){



            titleRow.fields[sortedColumnNum] = token;

            //This removes the last whitespace value (\n) because for the last column in the CSV, the check would fail
            titleRow.fields[sortedColumnNum] = removeWhitespace(titleRow.fields[sortedColumnNum], strlen(titleRow.fields[sortedColumnNum]) - 1);


            if (strcmp(titleRow.fields[sortedColumnNum], selectedColumn) == 0){
                //the column exists
                selectedColumnExist = 1;

            }

            sortedColumnNum++;
        }

        if (selectedColumnExist != 1){
            printf("Sorry, the column you entered doesn't exist in the csv\n");
            return;
        }

        titleRow.sortedColumnNum = sortedColumnNum;

        int length = strlen(titleRow.fields[sortedColumnNum - 1]);
        if(titleRow.fields[sortedColumnNum - 1][length - 1] == '\n'){
            titleRow.fields[sortedColumnNum - 1][length - 2] = '\0';
        }

        // trim column titles
        int i = 0;
        while(i < sortedColumnNum){
            titleRow.fields[i] = removeWhitespace(titleRow.fields[i], strlen(titleRow.fields[i]) - 1);
            i++;
        }

        row *data;
        int numberOfRows;
        data = (row*) malloc (sizeof(row) * 15000); //size matters


        // non title rows, aka all the other ones
        row regularRow;
        regularRow.rowValue = (char*) malloc (sizeof(char) * 1000);
        int currentRow = 0;

        while(fgets(regularRow.rowValue, 999, fp) != NULL){
            regularRow.rowLength = strlen(regularRow.rowValue);
            regularRow.fields = (char**) malloc(sizeof(char *) * (sortedColumnNum+1));
            regularRow.fields = customStrTok(regularRow.rowValue, sortedColumnNum);
            data[currentRow++] = regularRow;
        }

        numberOfRows = currentRow;

        int columnToSort = 0;

        while(columnToSort < titleRow.sortedColumnNum){
            if(strcmp(titleRow.fields[columnToSort], selectedColumn) == 0){
                break;
            }
            columnToSort++;
        }

        //Call mergesort

        mergeSort(data, columnToSort, numberOfRows);



        //Export to a new file

        FILE *fp2;
        char* filename2;

        //make this work
        //filename2= strcat(fileName, "-sorted-.csv");
filename2="result.csv";

        printf("%s\n", "Begin export");
        printf("%s\n", filename2);


        fp2=fopen(filename2,"w+");


        int vv,zz;
        vv = 0;

            while(vv < sortedColumnNum){

                fprintf(fp2, titleRow.fields[vv]);

                if(vv != sortedColumnNum - 1){
                    fprintf(fp2, ",");
                }else{
                    fprintf(fp2, "\n");
                }

                vv++;
            }

            vv = 0;
            zz = 0;

            while(vv < numberOfRows){

                while(zz < sortedColumnNum){

                    fprintf(fp2, data[vv].fields[zz]);

                    if(zz != sortedColumnNum - 1){
                        fprintf(fp2, ",");
                    }else{
                        fprintf(fp2, "\n");
                    }

                    zz++;
                }
                vv++;
                zz = 0;
            }

            fclose(fp2);


}

void traverseDirectory(char* dirName, char* selectedColumn){

    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir (dirName)) != NULL) {
      /* print all the files and directories within directory */
      while ((ent = readdir (dir)) != NULL) {
          char* itemName = ent->d_name;
          int length = strlen(itemName);
         int pid;


         //CSV FILE FOUND
          if (length > 0 && itemName[length - 1] == 'v'
                  && itemName[length - 2] == 's'
                && itemName[length - 3] == 'c'
                && itemName[length - 4] == '.' )
{
              //confirm if valid csv file (opens correctly and has valid headers)
                            pid = fork();
                            printf("%d\n", pid);
                            printf("%s\n", "CSV found");


}


                        switch(pid){
                        case 0:

                        beginSort(itemName, selectedColumn);
                            break;


                        case -1:
                            printf("%s\n", "Error creating fork");

                        default:
                            continue;
                        }





                        return;



         // printf("%d\n", strlen(itemName));



        //printf ("%s\n", itemName);
      }
      closedir (dir);
    } else {
      /* could not open directory */
      perror ("");
      return;
    }

    //CODE TO OPEN DIRECTORY GIVEN GOES HERE
    /*
     * DIR* test = opendir(dirName);
     * dirent* newfile = readdir(test);
     *
     */


    /*for (loop to iterate through everything in given dirName){
         *

         if (DIRECTORY){

             pid = fork();

             switch(pid){
             case 0:
                traverseDirectory(directoryName);


             case -1:
             error

             default:
             continue;

         }

        if (FILE){
            if (currentFile == CSV && currentFile == validCSV file){

                pid == fork();

            }

            switch(pid){
            case 0:
            beginSort(currentfile, selectedColumn)
                break;


            case -1:
                error

            default:
                continue;
            }



            }

            return;
        }*/


}

int main (int argc, char* argv[]){

    // column to sort by
    char* selectedColumn;

    if (strcmp(argv[1],"-c") != 0){
                printf("Sorry, you must use the -c flag to declare a column\n");
                return 1;
        }

    if (argc != 3 && argc != 5 && argc != 7){
        printf("Invalid argument size\n");
                    return 1;
    }

    selectedColumn = argv[2];


    //beginSort(selectedColumn, "movie_metadata.csv");

    if (argc == 3){
        traverseDirectory("./", selectedColumn);
    }



    if (argc == 5){
        //check if argv[3] == -d
        //{ do something }

        //check if arv[3] == -o
        //{ do something }
    }

    if (argc == 7){
        //check if argv[3] == -d && argv[5] == -o
        //
    }



    return 0;
}

Does anyone have any idea? GDB seems to report the segfault well before the line that the program actually fails on, so I'm stumped.

Your code looks a bit like this; I've removed the lines that aren't related to the problem.

fp = fopen(fileName, "r");
// ....
fgets(titleRow.rowValue, 999, fp);

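What happens in your code if fileName doesn't exist? If fopen() fails it returns NULL, and passing a NULL stream to fgets() is undefined behavior that typically shows up as exactly this kind of segfault. You should always check the return value of functions to make sure they've worked.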
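As a minimal sketch of that check (the helper names open_for_reading and read_title_row are hypothetical and not part of the posted sorter.c), the title row could be read like this, reporting the reason for the failure instead of crashing:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens fileName for reading and reports why it
 * could not be opened. Returns NULL on failure, like fopen(). */
FILE *open_for_reading(const char *fileName)
{
    FILE *fp = fopen(fileName, "r");
    if (fp == NULL) {
        fprintf(stderr, "Could not open '%s': %s\n", fileName, strerror(errno));
    }
    return fp;
}

/* Hypothetical helper: reads the first line of fileName into buf.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_title_row(const char *fileName, char *buf, int bufSize)
{
    FILE *fp = open_for_reading(fileName);
    if (fp == NULL) {
        return -1;              /* never hand a NULL stream to fgets() */
    }
    if (fgets(buf, bufSize, fp) == NULL) {
        fprintf(stderr, "Could not read a title row from '%s'\n", fileName);
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

In beginSort() the same pattern would apply directly after the fopen() call: return early if fp is NULL, and check the result of the first fgets() before calling strlen() or strtok() on the buffer.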

You're assuming that because traverseDirectory found the file, you can open it. But you're forgetting that itemName is relative to dirName. You need to combine the two to get the full filename before you can open it.
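One way to combine the two might look like the sketch below. The helper name join_path and the choice of snprintf are assumptions made for illustration; the original code has no such function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "dirName/itemName" into out, adding a '/'
 * only when dirName is non-empty and does not already end with one.
 * Returns 0 on success, -1 if the result does not fit in outSize bytes. */
int join_path(char *out, size_t outSize, const char *dirName, const char *itemName)
{
    size_t dirLen = strlen(dirName);
    const char *sep = (dirLen == 0 || dirName[dirLen - 1] == '/') ? "" : "/";
    int written = snprintf(out, outSize, "%s%s%s", dirName, sep, itemName);

    /* snprintf returns the length it wanted to write; a value >= outSize
     * means the path was truncated. */
    if (written < 0 || (size_t)written >= outSize) {
        return -1;
    }
    return 0;
}
```

Inside the readdir() loop this would be called with dirName and ent->d_name, and the resulting buffer passed on as the file name. Note that in the posted code beginSort is defined as beginSort(selectedColumn, fileName) but called as beginSort(itemName, selectedColumn), so the joined path would need to go in the second position.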

There is also a problem with how you identify a CSV file.

if (length > 0 && itemName[length - 1] == 'v'
    && itemName[length - 2] == 's'
    && itemName[length - 3] == 'c'
    && itemName[length - 4] == '.' )

If length is 3 (for example), you'll end up accessing itemName[-1], which isn't right. You want to make sure that length is at least 5 (unless a file called ".csv" is valid?), and you could use strcmp to make it easier to see what you're doing too.

if (length > 4 && strcmp(itemName+length-4,".csv")==0)
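Wrapped in a small function, the check above could be sketched as follows. The name has_csv_suffix is hypothetical, and like the condition above it treats a file named exactly ".csv" as not a match.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if name ends in ".csv" and has at least
 * one character before the extension, 0 otherwise. The length test runs
 * first, so name + length - 4 never points before the start of name. */
int has_csv_suffix(const char *name)
{
    size_t length = strlen(name);
    return length > 4 && strcmp(name + length - 4, ".csv") == 0;
}
```

In traverseDirectory the four-part character comparison would then reduce to if (has_csv_suffix(itemName)).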
