
How to save text file to struct with string in C++

I want to load the contents of a file into a struct. I tried using seekg and read, but it isn't working.

My file is something like:

johnmayer24ericclapton32

I want to store the first name, the last name, and the age in a struct like this:

typedef struct test_struct{
    string name;
    string last_name;
    int age;
} test_struct;

Here is my code

int main(){

    test_struct ts;
    ifstream data_base;

    data_base.open("test_file.txt");

    data_base.seekg(0, ios_base::beg);
    data_base.read(ts, sizeof(test_struct));

    data_base.close();

    return 0;
}

It doesn't compile because the compiler won't accept ts as an argument to read. Is there another way, or any way, of doing this?

You'll have to develop a specific algorithm, since there is no separator character between the "fields".

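#include <cstdlib>
#include <iostream>
#include <string>

int main()
{
static const std::string input_text = "johnmayer24ericclapton32";
static const std::string alphabet = "abcdefghijklmnopqrstuvwxyz";
static const std::string decimal_digit = "0123456789";

// Find the first character that is not a letter; the name ends just before it.
std::string::size_type position = 0;
std::string            artist_name;
position = input_text.find_first_not_of(alphabet);
if (position != std::string::npos)
{
   // substr takes (start, length), so the length is position, not position - 1.
   artist_name = input_text.substr(0, position);
}
else
{
   std::cerr << "Artist name not found.";
   return EXIT_FAILURE;
}
std::cout << artist_name << '\n';
return EXIT_SUCCESS;
}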

Similarly, you can extract the number with find_first_not_of(decimal_digit), then use std::stoi to convert the numeric string to an int.
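As a rough sketch of how the name and number extraction could be combined into a loop over the whole input, something like the following might work. The names raw_record and parse_records are made up for illustration, and the sketch searches for the first digit with find_first_of rather than for the first non-letter. It assumes the text consists only of letters followed by digits, repeated, with no trailing newline or other whitespace.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type: the letters before each number are kept together,
// because the input gives no way to tell first and last name apart.
struct raw_record {
    std::string full_name;  // e.g. "johnmayer"
    int age;                // e.g. 24
};

// Splits text such as "johnmayer24ericclapton32" into (letters, number) pairs.
// Assumption: the text contains nothing but name/age pairs, with no whitespace.
std::vector<raw_record> parse_records(const std::string& text) {
    static const std::string digits = "0123456789";
    std::vector<raw_record> records;
    std::string::size_type pos = 0;

    while (pos < text.size()) {
        // The name runs up to the first decimal digit.
        const std::string::size_type name_end = text.find_first_of(digits, pos);
        if (name_end == std::string::npos || name_end == pos) {
            throw std::runtime_error("expected a name followed by an age");
        }
        // The age runs up to the next non-digit, or to the end of the text.
        std::string::size_type age_end = text.find_first_not_of(digits, name_end);
        if (age_end == std::string::npos) {
            age_end = text.size();
        }
        records.push_back({text.substr(pos, name_end - pos),
                           std::stoi(text.substr(name_end, age_end - name_end))});
        pos = age_end;
    }
    return records;
}
```

If the text comes from a file read with std::getline, the trailing newline is already removed; otherwise it would need to be stripped before calling parse_records.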

Edit 1: Splitting the name
Since there is no separator character between the first and last name, you may want to have a list of possible first names and use that to find out where the first name ends and the surname starts.
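One possible way to apply a list of first names is a longest-prefix match, sketched below. The function name split_name and the idea of preferring the longest match are assumptions made for this example; the list of known first names has to come from somewhere else, and a name that is not on the list cannot be split this way.

```cpp
#include <string>
#include <utility>
#include <vector>

// Returns {first, last}. If no known first name is a proper prefix of
// full_name, first is empty and last holds the whole input.
std::pair<std::string, std::string> split_name(
    const std::string& full_name,
    const std::vector<std::string>& known_first_names) {
    std::string best;
    for (const std::string& candidate : known_first_names) {
        // A usable candidate must match the start and leave room for a surname.
        const bool is_prefix =
            candidate.size() < full_name.size() &&
            full_name.compare(0, candidate.size(), candidate) == 0;
        // Prefer the longest match so that "johnny" wins over "john".
        if (is_prefix && candidate.size() > best.size()) {
            best = candidate;
        }
    }
    return {best, full_name.substr(best.size())};
}
```

With the list {"eric", "john"}, the input "johnmayer" would split into "john" and "mayer". The approach is still ambiguous when one first name plus the start of a surname spells another first name, so a real separator in the file remains the more robust fix.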

Serialization/Deserialization of strings is tricky.

For binary data, the convention is to write the length of the string first, then the string data.

https://isocpp.org/wiki/faq/serialization#serialize-binary-format

  • String data is tricky because you have to unambiguously know when the string's body stops. You can't unambiguously terminate all strings with a '\0' if some string might contain that character; recall that std::string can store '\0'. The easiest solution is to write the integer length just before the string data. Make sure the integer length is written in “network format” to avoid sizeof and endian problems (see the solutions in earlier bullets).

That way, when reading the data back in, you know how long the string will be, so you can preallocate the string and then read exactly that many bytes from the stream.

If your data is in a non-binary (text) format, it's a little trickier:

https://isocpp.org/wiki/faq/serialization#serialize-text-format

  • String data is tricky because you have to unambiguously know when the string's body stops. You can't unambiguously terminate all strings with a '\n' or '"' or even '\0' if some string might contain those characters. You might want to use C++ source-code escape-sequences, e.g., writing '\' followed by 'n' when you see a newline, etc. After this transformation, you can either make strings go until end-of-line (meaning they are delimited by '\n') or you can delimit them with '"'.
  • If you use C++-like escape-sequences for your string data, be sure to always use the same number of hex digits after '\x' and '\u'. I typically use 2 and 4 digits respectively. Reason: if you write a smaller number of hex digits, e.g., if you simply use stream << "\\x" << hex << unsigned(theChar), you'll get errors when the next character in the string happens to be a hex digit. E.g., if the string contains '\xF' followed by 'A', you should write "\x0FA", not "\xFA".
  • If you don't use some sort of escape sequence for characters like '\n', be careful that the operating system doesn't mess up your string data. In particular, if you open a std::fstream without std::ios::binary, some operating systems translate end-of-line characters. Another approach for string data is to prefix the string's data with an integer length, e.g., to write "now is the time" as 15:now is the time. Note that this can make it hard for people to read/write the file, since the value just after that might not have a visible separator, but you still might find it useful.
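The escape-sequence idea from the bullets above could be sketched roughly as follows. The functions escape, unescape and hex_value are hypothetical names for this example, and the set of escaped characters (newline, backslash, double quote, and other non-printable bytes) is one reasonable choice rather than a fixed rule. The key point it illustrates is always writing exactly two hex digits after \x, so the reader knows where the escape ends.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Escapes a string so that it fits on a single line of text.
// Non-printable bytes become \x followed by exactly two hex digits.
std::string escape(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : s) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c == '\n') {
            out += "\\n";
        } else if (c == '\\') {
            out += "\\\\";
        } else if (c == '"') {
            out += "\\\"";
        } else if (c < 0x20 || c >= 0x7F) {
            out += "\\x";
            out += hex[c >> 4];    // high nibble, always written
            out += hex[c & 0x0F];  // low nibble
        } else {
            out += ch;
        }
    }
    return out;
}

// Converts one hex digit to its value; throws on anything else.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    throw std::invalid_argument("bad hex digit");
}

// Reverses escape(). Reads exactly two hex digits after \x.
std::string unescape(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != '\\') {
            out += s[i];
            continue;
        }
        if (++i >= s.size()) throw std::invalid_argument("dangling backslash");
        switch (s[i]) {
            case 'n':  out += '\n'; break;
            case '\\': out += '\\'; break;
            case '"':  out += '"';  break;
            case 'x':
                if (i + 2 >= s.size()) throw std::invalid_argument("short \\x escape");
                out += static_cast<char>(hex_value(s[i + 1]) * 16 + hex_value(s[i + 2]));
                i += 2;
                break;
            default:
                throw std::invalid_argument("unknown escape");
        }
    }
    return out;
}
```

After escaping, each string contains no raw newline, so it can be written with one string per line and read back with std::getline followed by unescape.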

Text-based serialization conventions vary, but one field per line is an accepted practice.
