
Multi-threaded file reading produces the same result for each thread

As the title says, I am trying to write a multi-threaded application that reads a file and sums up its contents. This works correctly with one thread. However, when more threads are introduced, they all produce the same output. How do I fix this?

The code

void *sumThread(void *);
pthread_mutex_t keepOut = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t keepOutSum = PTHREAD_MUTEX_INITIALIZER;
int counter = 0, line_count = 0;
char* loc;
double total = 0;

void split(const string& s, char c, vector<string>& v)
{
    string::size_type i = 0;
    string::size_type j = s.find(c);

    while (j != string::npos)
    {
        v.push_back(s.substr(i, j - i));
        i = ++j;
        j = s.find(c, j);

        if (j == string::npos)
            v.push_back(s.substr(i, s.length()));
    }
}

int main(int argc, char* argv[])
{

    if (argc < 2)
    {

        cerr << "Usage: " << argv[0] << " filename" << endl;
        return 1;
    }

    string line;
    loc = argv[1];
    ifstream myfile(argv[1]);
    myfile.unsetf(ios_base::skipws);

    line_count = std::count(std::istream_iterator<char>(myfile),
                            std::istream_iterator<char>(),
                            '\n');

    myfile.clear();
    myfile.seekg(-1, ios::end);
    char lastChar;
    myfile.get(lastChar);
    if (lastChar != '\r' && lastChar != '\n')
        line_count++;

    myfile.setf(ios_base::skipws);
    myfile.clear();
    myfile.seekg(0, ios::beg);

    pthread_t thread_id[NTHREADS];

    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_create(&thread_id[i], NULL, sumThread, NULL);
    }

    for (int i = 0; i < NTHREADS; ++i)
    {
        pthread_join(thread_id[i], NULL);
    }

    cout << setprecision(2) << fixed << total << endl;
    return 0;
}

void *sumThread(void *)
{

    pthread_mutex_lock(&keepOut);
    int threadNo = counter;
    counter++;
    pthread_mutex_unlock(&keepOut);

    ifstream myfile(loc);
    double runningTotal = 0;
    string line;

    if (myfile.is_open())
    {
        for (int i = threadNo; i < line_count; i += NTHREADS)
        {
            vector < string > parts;

            getline(myfile, line);
            // ... and extract the 3rd element in the CSV.
            split(line, ',', parts);

            if (parts.size() != 3)
            {
                cerr << "Unable to process line " << i
                        << ", line is malformed. " << parts.size()
                        << " parts found." << endl;
                continue;
            }

            // Add this value to the account running total.
            runningTotal += atof(parts[2].c_str());
        }
        myfile.close();
    }
    else
    {
        cerr << "Unable to open file";
    }

    pthread_mutex_lock(&keepOutSum);

    cout << threadNo << ":  " << runningTotal << endl;
    total += runningTotal;
    pthread_mutex_unlock(&keepOutSum);
    pthread_exit (NULL);
}

Sample output

 2:  -46772.4
 0:  -46772.4
 1:  -46772.4
 3:  -46772.4
 -187089.72

Each thread is supposed to read and sum up numbers from the file, and the results are added together when the threads are done. However, the threads all report the same number, even though the threadNo variable is clearly different for each thread, as the output shows.

Your problem is here:

for (int i = threadNo; i < line_count; i += NTHREADS) {
    vector<string> parts;

    getline(myfile, line);

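getline() doesn't know the value of i; it simply reads the next line from the stream, without skipping any lines. Each thread opens its own ifstream, which starts at the beginning of the file, so incrementing i by NTHREADS only reduces the number of lines each thread reads. Hence all threads read the same first few lines of the file and compute the same sum.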
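One way to address this, as a minimal sketch rather than a definitive fix: let every thread read every line, but have it process only the lines whose index maps to its own thread number. The names StripeTask, sumStripe and sumFileStriped are hypothetical and do not appear in the original code. The sketch also assumes the same three-field CSV layout as the question, and it splits fields with std::getline on ',' instead of the question's split() helper.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-thread work description (not part of the original code).
struct StripeTask {
    std::string path; // file to read
    int threadNo;     // this thread's index, 0 .. nThreads - 1
    int nThreads;     // total number of worker threads
    double sum;       // partial sum written by the worker
};

// Sum the third comma-separated field of the lines owned by one thread.
void *sumStripe(void *arg)
{
    StripeTask *task = static_cast<StripeTask *>(arg);
    std::ifstream in(task->path.c_str());
    std::string line;

    // Every thread reads every line, but only processes the lines whose
    // index modulo nThreads equals its own threadNo.
    for (int lineNo = 0; std::getline(in, line); ++lineNo) {
        if (lineNo % task->nThreads != task->threadNo)
            continue;

        std::vector<std::string> parts;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ','))
            parts.push_back(field);

        if (parts.size() != 3)
            continue; // malformed line: skipped in this sketch

        task->sum += std::atof(parts[2].c_str());
    }
    return NULL;
}

// Start nThreads workers on the same file and add up their partial sums.
double sumFileStriped(const std::string &path, int nThreads)
{
    assert(nThreads > 0);
    std::vector<StripeTask> tasks(nThreads);
    std::vector<pthread_t> ids(nThreads);

    for (int i = 0; i < nThreads; ++i) {
        tasks[i].path = path;
        tasks[i].threadNo = i;
        tasks[i].nThreads = nThreads;
        tasks[i].sum = 0.0;
        pthread_create(&ids[i], NULL, sumStripe, &tasks[i]);
    }

    double total = 0.0;
    for (int i = 0; i < nThreads; ++i) {
        pthread_join(ids[i], NULL); // tasks[i].sum is final after the join
        total += tasks[i].sum;
    }
    return total;
}
```

Calling sumFileStriped(argv[1], NTHREADS) from main would replace the thread creation loop and the global total. Because each worker writes only to its own StripeTask and the sums are combined after pthread_join, no mutex is needed here. Note that every thread still reads the whole file, so this divides the parsing work but not the I/O. On most toolchains the code is built with the -pthread flag.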
