
Read word from a text file

This is the requirement I must follow:

There will be a C style or C++ style string to hold the word. An int to hold a count of each word. A struct or class to hold both of these. This struct/class will be inserted into an STL list. You will also need a C style or C++ style string to hold the line of text you read from the files. You will parse this line into words as per the word definition in the assign spec.

The first part seems fine, but for the second part I still don't see the point of reading a line and then parsing it into words. Is that more efficient than reading words straight from the text file?

The efficiency depends on the definition of a word (which comes from the assignment spec): if you need to go through the line more than once to determine where a word begins and ends (i.e. what belongs to a word), it is more efficient to keep the line in memory than to read from disk multiple times (although the performance impact of repeated reads can be lessened by the I/O cache).
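To illustrate why the word definition matters, here is a minimal sketch that parses a line already held in memory. The real definition comes from the assignment spec, which is not shown here, so the rule below is only an assumption: a word is a run of letters, and an apostrophe or hyphen belongs to the word only when it sits between two letters. Deciding that requires looking at neighbouring characters, which is cheap on a std::string in memory. The name `extractWords` is made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical word definition (an assumption, not the assignment's):
// letters, plus ' or - when there is a letter on both sides.
std::vector<std::string> extractWords(const std::string& line) {
    auto isLetter = [](char c) {
        return std::isalpha(static_cast<unsigned char>(c)) != 0;
    };
    std::vector<std::string> words;
    std::string current;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        // A joiner needs a word in progress and a letter right after it.
        const bool joiner = (c == '\'' || c == '-')
            && !current.empty()
            && i + 1 < line.size() && isLetter(line[i + 1]);
        if (isLetter(c) || joiner) {
            current += c;
        } else if (!current.empty()) {
            // Any other character ends the word in progress.
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

With this rule, the look-ahead at `line[i + 1]` is a plain index into the string; doing the same directly on a file stream would mean peeking or seeking in the stream instead.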

Even if there is no performance gain, this being a homework assignment, I think you are asked to do this to learn 1) how to read strings (lines) from a file; 2) how to parse a string in memory. To achieve both goals at once, you have this requirement.

Use fstream to read each line from the file, then parse it into words by splitting on spaces in a loop until the end of the line.
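A rough sketch of that approach, assuming words are simply separated by whitespace (the assignment's real word definition may differ). The names `WordCount` and `countWords` are invented for the example; the struct and the STL list follow the requirement quoted in the question.

```cpp
#include <istream>
#include <list>
#include <sstream>
#include <string>

// Holds a word and how many times it was seen.
struct WordCount {
    std::string word;
    int count;
};

// Reads the stream line by line, splits each line on whitespace,
// and tallies the words in a list (in order of first appearance).
std::list<WordCount> countWords(std::istream& in) {
    std::list<WordCount> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);  // parse the line held in memory
        std::string word;
        while (words >> word) {
            bool found = false;
            for (WordCount& entry : counts) {
                if (entry.word == word) {
                    ++entry.count;
                    found = true;
                    break;
                }
            }
            if (!found) counts.push_back(WordCount{word, 1});
        }
    }
    return counts;
}
```

The function takes any std::istream, so it can be called with a std::ifstream opened on the input file or with a std::istringstream for testing. The linear search through the list is slow for large vocabularies, but it matches the list-based structure the assignment asks for.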

Depending on your use case, it can be useful to read files line by line.

Reading the whole file into memory first and parsing it afterward does not minimize memory usage. The memory required for your program to run would be at least the size of the file. If the input file is big compared to the memory available to your program, you won't be able to allocate enough memory to store the entire file (try to allocate a string of 20GB to see what happens).

On the other hand, if you read line by line, only the size of one line is needed in memory at a time: you can release memory allocated for previous lines immediately.

So parsing line by line is useful if:

  • The input files are too big to fit entirely in memory
  • The size of each line is small enough (reading line by line does not help if the file is made of one large line)
