
Choose a portion of a text file using 2 string delimiters in C++

I have a small problem with splitting a text file. My text file contains almost 10 thousand recipes like this:

-Ing_principal

ingr 1

-Ingredients

ingr 1

ingr 2

ingr 3

-Preparation

Now, how can I get only the ingredients between the 2 delimiters, which are -Ingredients and -Preparation?

So I came up with this solution:

int main() {
string s, t;
bool i = false;
ifstream ricette;
ofstream ingredienti;
ingredienti.open("ingredienti.txt");
ricette.open("ricette.txt", ios::out);
while(ricette) {        
    getline (ricette, s);
    if (s[0] == '-' && s[1] == 'I' && s[5] != 'P') {
        i = true;
        getline(ricette, t);
            while (i) {
                if (t[0] != '-' && t[1] != 'P')
                    cout <<  t << endl;
                else i = false; 

        }
    }
}
ingredienti.close();
ingredienti.close();  }

But this prints only ingr 1 in an infinite loop. Does anyone have a good solution or suggestion?

It seems you don't read new input lines in this loop:

        while (i) {
            if (t[0] != '-' && t[1] != 'P')
                cout <<  t << endl;
            else i = false; 

            // Here you'll need to read the next line
        }

This line also looks strange:

if (s[0] == '-' && s[1] == 'I' && s[5] != 'P') {

I guess it should be a 'p' instead of a 'P', since the character at index 5 of -Ing_principal is a lowercase 'p':

if (s[0] == '-' && s[1] == 'I' && s[5] != 'p') {

BTW, you close the same file twice:

ingredienti.close();
ingredienti.close();

However, I would use another approach to avoid the two nested while statements. Something like:

#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main() {
    string s;
    bool foundInterestingSection = false;
    ifstream ricette("ricette.txt");
    ofstream ingredienti("ingredienti.txt");

    while(getline (ricette, s))
    {
        if (foundInterestingSection)
        {
            if (s == "-Preparation")
            {
                // The interesting section ends now
                foundInterestingSection = false;
            }
            else
            {
                cout <<  s << endl;

                // Write to output file
                ingredienti << s << endl;
            }
        }
        else
        {
            if (s == "-Ingredients")
            {
                // The interesting section starts now
                foundInterestingSection = true;
            }
        }
    }
    ingredienti.close();
    ricette.close();
}

You want to access a portion that is delimited by two delimiters. The straightforward solution is then to search for those two delimiters. You can then copy the contents in between for further use.

The approach I used first buffers the whole input from std::cin, because it doesn't support arbitrary moving around in the input. When using a file, this is most likely not necessary.
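The buffering step mentioned above could look roughly like the sketch below. It assumes the input fits comfortably in memory; the names read_all and check_read_all are made up for illustration and do not come from the original example.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything that is left in `in` into one string,
// so that iterators can later move around in the text freely.
std::string read_all(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Usage sketch: any std::istream works, e.g. std::cin or a std::ifstream.
// A string stream is used here only to keep the example self-contained.
void check_read_all() {
    std::istringstream input("-Ingredients\ningr 1\n-Preparation\n");
    assert(read_all(input) == "-Ingredients\ningr 1\n-Preparation\n");
}
```

Calling read_all(std::cin) would give you a std::string whose begin() and end() can serve as the search range.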

To perform searches, the best solution is std::search from <algorithm>; you can use it to find the first occurrence of a sequence inside another one. In your case, this means finding "-Ingredients" or "-Preparation" inside the file.

std::string const start_delimiter{"-Ingredients"};
auto start = std::search(from, to, start_delimiter.begin(), start_delimiter.end());
// start now points to '-', assuming the string was found
std::advance(start, start_delimiter.size());
// start now points start_delimiter.size() characters AFTER the '-', which
// is the character following the delimiter string
// ...
std::string const end_delimiter{"-Preparation"};
auto end = std::search(start, to, end_delimiter.begin(), end_delimiter.end());
// Your text is between [start,end)
from = end;
std::advance(from, end_delimiter.size());

You use this to find both delimiters; then you can use the part between the respective iterators to extract / print / work with the text you're interested in. Note that you might need to add newline characters to your delimiters.

I put together a small example, though you might want to factor the reading into a function, either returning the respective text parts or taking a functor to work on each text part.
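A minimal sketch of the search approach, factored into a function that returns the text parts, might look like this. It assumes the whole input is already held in a std::string and that delimiters may match anywhere in the text, not only at the start of a line. The name extract_sections is hypothetical, and unlike the snippet above it checks whether each search actually found something before advancing.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns every chunk of `text` that lies between
// `start_delim` and the next `end_delim` that follows it.
std::vector<std::string> extract_sections(const std::string& text,
                                          const std::string& start_delim,
                                          const std::string& end_delim) {
    assert(!start_delim.empty() && !end_delim.empty());

    std::vector<std::string> sections;
    auto from = text.begin();
    auto const to = text.end();

    while (true) {
        auto start = std::search(from, to, start_delim.begin(), start_delim.end());
        if (start == to) break;                   // no further start delimiter
        std::advance(start, start_delim.size());  // skip past the delimiter

        auto end = std::search(start, to, end_delim.begin(), end_delim.end());
        if (end == to) break;                     // unterminated section: ignored

        sections.emplace_back(start, end);        // the text in [start, end)
        from = end;
        std::advance(from, end_delim.size());     // continue after the end delimiter
    }
    return sections;
}
```

Calling extract_sections(text, "-Ingredients\n", "-Preparation\n") with the newline included in both delimiters yields one string per recipe, each holding that recipe's ingredient lines.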


Concerning your code, there are multiple issues:

ifstream ricette;
// ...
ricette.open("ricette.txt", ios::out);
// ...
getline(ricette, t);

You take an input file stream, open it for output, then read from it?

  getline(ricette, t);
  while (i) {
            // ...
  }

You only read one line of the ingredients. You need to perform the reading inside your loop, otherwise t will never change inside that while loop (which is why you get an infinite loop).

ingredienti.close();
ingredienti.close();

... double close ...

Then, in general, you should directly test the input operations, i.e. the getline:

std::string t; // Use better names, define variables near their use
while(getline(ricette, t)) {
  if (t[0] == '-' && t[1] == 'P') {
   break;
  }
}
// could be eof/failure OR "-P.." found

Then, looking at your test, think of what happens when the input contains an empty line, or a line with only a single character. You need to test for the size, too:

if (t.size() > 1 && t[0] == '-' && t[1] == 'P')
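One way to avoid repeating the size check at every comparison is to fold it into a small helper. The sketch below is only an illustration: the names has_prefix and check_has_prefix are hypothetical, and the edge cases from the paragraph above are written out as assertions.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if `line` begins with `prefix`.
// The size test comes first, so short or empty lines are never indexed
// past their end.
bool has_prefix(const std::string& line, const std::string& prefix) {
    return line.size() >= prefix.size() &&
           line.compare(0, prefix.size(), prefix) == 0;
}

// The edge cases discussed above, written as checks.
void check_has_prefix() {
    assert(!has_prefix("", "-P"));             // empty line
    assert(!has_prefix("-", "-P"));            // single character
    assert(has_prefix("-Preparation", "-P"));  // end delimiter
    assert(!has_prefix("-Ingredients", "-P")); // some other marker
}
```

With such a helper, the loop condition becomes if (has_prefix(t, "-P")) break; and comparing against the full delimiter text is just a matter of passing a longer prefix.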

And finally, your code assumes different things than what you told us. (Your delimiters are "-I" followed by a "not p" test, as well as "-P".)
