How do I compare text lines in 2 files between 2 specific characters in a C++ program

I am writing a program that cleans up an XML file and adds new lines to a settings section from a txt file. Part of my code is a section labeled // <settings> Part. During or after that section (either is fine), I would like to compare the lines to make sure they are not duplicated, but ignoring their values, in this case true and false. Two lines should count as identical if one is set to true and the other to false, and only the second line should be kept while the first is discarded. Here is an example of how the settings look:

    <setting1>true</setting1>
    <setting2blue>false</setting2blue>
    <setting3>true</setting3>
    <setting1>false</setting1>
    <setting4>true</setting4>
    <setting2blue>true</setting2blue>

So in the end I would like the first setting 1 to be removed and the second setting 1 to stay, and the same thing for setting 2. Keep in mind this is only an example: the real settings have different names and sometimes contain the same words.

I've tried to use .compare but got really lost, as I am still very new to C++. I even thought that I might need to open a new input stream and output stream and then compare after my previous work was done, but I am still getting hung up on how to compare.

I appreciate any help.

Thanks, Vendetto

Here is part of the program that I broke out so I could test without having to run the whole thing.

#include <stdio.h>
#include <fstream>
#include <sstream>
#include <iostream>
#include <string>
#include <cctype>
#include <cstdlib>
#include <set>
#include <vector>
#include <algorithm>
#include <cassert>
#include <Windows.h>
using namespace std;


bool isSpace(unsigned char c) {
    return ( c == '\r' ||
        c == '\t' || c == '\v' || c == '\f');
}


int main()
{


    const string Dir{ "C:/synergyii/config/" };
    ifstream in_config{ Dir + "clientconfig.xml" },
        in_newlines{ Dir + "newlines.txt" };
    ofstream out{ Dir +  "cltesting.txt" };


    vector<string> vlines31;
    vector<string> vlines32;
    set<string>    slines31;
    set<string>    slines32;


    for (string line31; getline(in_config, line31); vlines31.push_back(line31))
        if (line31.find("<settings>") != string::npos) {
            vlines31.push_back(line31);
            break;
        }


    for (const auto& v : vlines31)
        out << v << '\n';


    // <settings> Part
    
    for (string line32; getline(in_config, line32) && line32.find("</settings>") == string::npos; ) {
        line32.erase(remove_if(line32.begin(), line32.end(), isSpace), line32.end());
        line32.erase(line32.find_last_not_of(" ") + 1);
        const auto& result = slines32.insert(line32);
        if (result.second)
            vlines32.push_back(line32);
    }


    for (string line32; getline(in_newlines, line32);) {
        line32.erase(remove_if(line32.begin(), line32.end(), isSpace), line32.end());
        const auto& result = slines32.insert(line32);
        if (result.second)
            vlines32.push_back(line32);
    }


    vlines32.erase(unique(vlines32.begin(), vlines32.end()), vlines32.end() );


    for (auto it = vlines32.cbegin(); it != vlines32.cend(); ++it)
        out << '\t' << '\t' << *it << '\n';


    out << '\t' << "</settings>\n";
    out << "</config>\n";


    in_config.close();
    out.close();
}

A note about XML first:

XML allows formatting which doesn't necessarily change the meaning of its contents. Besides indentation, an element may be written on one line or spread over multiple lines. It's even allowed to write the whole XML file on one line (assuming there are no newlines in the elements' contents, as in the OP's case).

Reading XML correctly with C++ standard I/O is more complicated than a few calls to std::getline(). To do it right, an XML library should be used to read the XML file into a DOM for the intended processing.
E.g. SO: What XML parser should I use in C++? provides an overview of available XML libraries.


That being said, I want to demonstrate a possible solution for the OP's question, but using another, even simpler config format: key-value pairs separated by a colon (:).
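To make the simplified format concrete, here is a minimal sketch of how one such line could be split at the first colon. The helper name splitKeyValue and the trimming of blanks after the colon are assumptions made for this illustration; the demo further down only needs the key.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original answer): splits
// "key: value" at the first colon. Returns false if the line has no colon,
// in which case key and value are left untouched.
bool splitKeyValue(const std::string &line, std::string &key, std::string &value)
{
  const std::size_t i = line.find(':');
  if (i == std::string::npos) return false; // e.g. a comment line
  key = line.substr(0, i);
  // skip blanks between the colon and the value
  const std::size_t j = line.find_first_not_of(" \t", i + 1);
  value = (j == std::string::npos) ? std::string() : line.substr(j);
  return true;
}
```

For "setting1: true" this yields the key "setting1" and the value "true"; a line such as "# sample config file" is reported as having no key.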

How to filter out duplicated keys:

The solution is actually simple:
The whole file is read line by line into a vector of strings.
If a line contains a key, the key is stored in a look-up table together with the index of its line.
If the key was already in that look-up table, the previous occurrence (line) is marked as invalid. To keep it simple, I just clear the line. If empty lines may be valid contents (which shall be kept in the file), something else should be used to mark the line, e.g. an extra bool stored with each line.
I didn't consider removing lines as an option because this would invalidate the stored line indices of all keys in the following lines (or I would have to iterate through the look-up table to fix them).

Demo:

#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> lines;

using LookUpTable = std::map<std::string, size_t>;

LookUpTable lut;

std::istream& readLine(std::istream &in)
{
  std::string line; if (!std::getline(in, line)) return in;
  const size_t iLine = lines.size();
  // extract key
  const size_t i = line.find(':');
  if (i != std::string::npos) { // Has the line a key at all?
    std::string key = line.substr(0, i);
    // look whether there was already this setting
    const LookUpTable::iterator iter = lut.find(key);
    if (iter != lut.end()) { // Was it already there?
      // clear previous line
      lines[iter->second].clear();
      // remember the new line index (emplace() would not overwrite it)
      iter->second = iLine;
    } else {
      // store key and line index
      lut.emplace(std::move(key), iLine);
    }
  }
  // store line in lines buffer
  lines.push_back(std::move(line));
  // done
  return in;
}

void readFile(std::istream &in)
{
  while (readLine(in));
}

void writeFile(std::ostream &out)
{
  for (const std::string &line : lines) {
    // skip empty lines
    if (line.empty()) continue;
    // write non-empty lines
    out << line << '\n';
  }
}

int main()
{
  std::string sample = R"(# sample config file
setting1: true
setting2blue: false
setting3: true
setting1: false
setting4: true
setting2blue: true
)";
  // read the sample
  { std::istringstream in(sample);
    readFile(in);
  }
  // write the sample (with clean-up)
  std::cout << "Output:\n";
  writeFile(std::cout);
}

Output:

Output:
# sample config file
setting3: true
setting1: false
setting4: true
setting2blue: true

Live Demo on coliru

Nit-picking:

An unordered map may provide even faster look-up than a map (constant time on average instead of logarithmic), possibly at the cost of a higher memory footprint. I doubt that this difference matters for this task but, with a minimal change, it works with a std::unordered_map as well:

Live Demo on coliru
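For readers without access to the linked demo, the change presumably amounts to swapping the include and the type alias. The sketch below shows this with a small hypothetical helper, recordKey, which is not taken from the original answer and is only there to show that find(), end() and emplace() are used the same way as with std::map.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Only the alias changes compared with the std::map version.
using LookUpTable = std::unordered_map<std::string, std::size_t>;

// Hypothetical helper: stores iLine for key. Returns true if the key was
// already known and writes the index of its previous line to iPrev.
bool recordKey(LookUpTable &lut, const std::string &key,
               std::size_t iLine, std::size_t &iPrev)
{
  const LookUpTable::iterator iter = lut.find(key);
  if (iter == lut.end()) {
    lut.emplace(key, iLine); // first occurrence of this key
    return false;
  }
  iPrev = iter->second;  // the line to be marked as invalid
  iter->second = iLine;  // remember the latest occurrence
  return true;
}
```

One difference worth keeping in mind: iterating over a std::unordered_map yields the keys in no particular order, which does not matter here because the table is only used for look-ups.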
