
Most efficient way to match two text files?

Let's say I have two text files. The text file "A.txt" contains names and their ages. The text file "B.txt" contains names and their weights. The names appear in a different order in each file.

//text file "A.txt"
Jason    20
Jack     34
Amanda   15
Einstein 65
Kelvin   47
//text file "B.txt"
Einstein 70
Amanda   55
Jack     99
Kelvin   85
Jason    68

What is the most efficient way, with the fewest operations, to read and match these two text files and store their attributes in an array of objects of a class?

class Person{
    private:
        string name;
        int age;
        int weight;
    public:
        //setter method
};

int main(){
    Person haha[5];
    //code to read files and stores into haha

}

You want to use std::unordered_map.

Overall, the algorithm looks like this:

  1. Read the first text file line by line and insert each entry into a std::unordered_map instance. The key of the map is a string that is unique for each person, for example the name, and the value is the Person object. Inserting into a std::unordered_map is O(1) on average.
  2. Read the second file line by line and look up each name with std::unordered_map::find. If the name is already in the map, it is the same person, so set the missing attribute on the existing Person; otherwise insert a new Person into the map. std::unordered_map::find is also O(1) on average.

If you want an array of Person objects, you can create one and move the objects into it. You can also iterate over the std::unordered_map directly, so it can be used instead of the array of Person objects.
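The two steps above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the answerer's code: PersonRecord, matchByName and toVector are hypothetical names, a plain struct stands in for the question's Person class, each line is assumed to hold a single-word name followed by an integer, and C++14 or later is assumed.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical plain record; the question's Person class would use setters instead.
struct PersonRecord {
    std::string name;
    int age = 0;     // 0 means: not set
    int weight = 0;  // 0 means: not set
};

// Reads "name value" lines from both streams and merges them by name.
std::unordered_map<std::string, PersonRecord>
matchByName(std::istream& ages, std::istream& weights) {
    std::unordered_map<std::string, PersonRecord> people;
    std::string line, name;
    int value = 0;

    // Step 1: insert every person from the ages input, keyed by name.
    while (std::getline(ages, line)) {
        std::istringstream fields(line);
        if (fields >> name >> value) {
            people[name] = PersonRecord{name, value, 0};
        }
    }

    // Step 2: look up every name from the weights input.
    while (std::getline(weights, line)) {
        std::istringstream fields(line);
        if (!(fields >> name >> value)) {
            continue;  // skip blank or malformed lines
        }
        auto it = people.find(name);
        if (it != people.end()) {
            it->second.weight = value;  // known person: fill in the weight
        } else {
            people.emplace(name, PersonRecord{name, 0, value});  // new person
        }
    }
    return people;
}

// Copies the map's values into a vector if a contiguous container is needed.
std::vector<PersonRecord>
toVector(const std::unordered_map<std::string, PersonRecord>& people) {
    std::vector<PersonRecord> result;
    result.reserve(people.size());
    for (const auto& entry : people) {
        result.push_back(entry.second);
    }
    return result;
}
```

Passing two std::ifstream objects opened on A.txt and B.txt to matchByName would merge the files; the order of the resulting elements is unspecified because the map is unordered.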

You can come up with elegant solutions if you use modern C++ language elements.

The key to matching things is to use an associative container, like std::map. In it you can store a key (e.g. the name) and a value (e.g. the weight) and look the value up by key in O(log n) time.

I also made a design decision: a person can have a weight only if they have an age. This means that if the file with the weights contains more entries than the file with the ages, the additional persons are ignored.
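The full program further down looks weights up with std::map::operator[], which inserts a value-initialized 0 for any name that is missing. As a hedged alternative, a lookup with find leaves the map untouched and can take it by const reference. The helper name weightFor is hypothetical, and returning 0 for "not set" follows the convention used in the Person class.

```cpp
#include <map>
#include <string>

// Hypothetical helper: returns the stored weight, or 0 ("not set")
// if the name is not in the map. The map itself is never modified.
int weightFor(const std::map<std::string, int>& weights, const std::string& name) {
    const auto it = weights.find(name);
    return it != weights.end() ? it->second : 0;
}
```

Either way, names that appear only in the weights file never reach the result, because only the names from the ages file are looked up.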

To be able to create Person objects, I added a constructor that simply copies the values to the member variables. The weight is optional.

To be able to show some nice output, I have overloaded the stream insertion operator (operator<<) for the Person class.

Each line in both source text files contains a std::string and an integer. To make reading easier, I created a proxy class that reads a std::pair<std::string, int> from a stream.

To read the weights file, we simply define a map and use its range constructor to read all values via a std::istream_iterator in conjunction with the proxy class for our string-int pairs. Please note that the end iterator is default constructed via {}. We also use the C++17 feature CTAD ("class template argument deduction") to define the std::map without template arguments.

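To match the values, we use std::transform, which reads the ages file, looks up the matching weight for each name and adds the resulting Person to our persons vector. A name without a weight gets the weight 0, because std::map::operator[] inserts a value-initialized entry for a missing key.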

Last but not least, we print the result to the console.

Please see a full working example:

#include <iostream>
#include <fstream>
#include <vector>
#include <map>
#include <string>
#include <utility>
#include <iterator>
#include <algorithm>

// Define a proxy class, to read a pair from a stream
struct StringIntegerPair : public std::pair<std::string, int> {
    friend std::istream& operator >> (std::istream& is, StringIntegerPair& sip) {
        return is >> sip.first >> sip.second;
    }
};

// Our test class
class Person {
private:
    std::string name{};
    int age{};      // 0 means: not set
    int weight{};   // 0 means: not set
public:
    //setter method

    // Constructor
    Person(const std::string& n, const int a, const int w = 0) : name(n), age(a), weight(w) {}
    // Output
    friend std::ostream& operator << (std::ostream& os, const Person& p) {
        return os << "Name: " << p.name << "\tAge: " << p.age << "\tWeight: " << p.weight;
    }
};

int main() {

    // Open file with weights and check, if it could be opened
    if (std::ifstream weightStream("r:\\b.txt"); weightStream) {

        // Open file with ages and check, if it could be opened
        if (std::ifstream ageStream("r:\\a.txt"); ageStream) {

            // Read all weights into a map
            std::map namesAndWeights(std::istream_iterator<StringIntegerPair>(weightStream), {});

            // Here we will store all persons
            std::vector<Person> persons{};

            // Now read all persons with age, check, if a weight is existing and add it to the vector
            std::transform(std::istream_iterator<StringIntegerPair>(ageStream), {}, std::back_inserter(persons),
                [&namesAndWeights](const std::pair<std::string, int>& na) { return Person(na.first, na.second, namesAndWeights[na.first]); });

            // Show result to user
            std::copy(persons.begin(), persons.end(), std::ostream_iterator<Person>(std::cout, "\n"));
        }
        else {
            std::cerr << "\n*** Error: Could not open file with ages\n";
        }
    }
    else {
        std::cerr << "\n*** Error: Could not open file with weights\n";
    }
    return 0;
}
