Comparing two text files and subtracting respective values

I have two text files, default.txt and current.txt.

default.txt:

ab_abcdefghi_EnInP005M3TSub.csv FMR: 0.0009 FNMR: 0.023809524 SCORE: -4  Conformity: True
ab_abcdefghi_EnInP025M3TSub.csv FMR: 0.0039 FNMR: 0 SCORE: -14  Conformity: True
ab_abcdefghi_EnInP050M3TSub.csv FMR: 0.01989 FNMR: 0 SCORE: -18  Conformity: True
ab_abcdefghi_EnInP075M3TSub.csv FMR: 0.0029 FNMR: 0 SCORE: -17  Conformity: True
ab_abcdefghi_EnInP090M3TSub.csv FMR: 0.0002 FNMR: 0 SCORE: -7  Conformity: True

current.txt looks like this:

ab_abcdefghi_EnUsP005M3TSub.csv FMR: 0.0041 FNMR: 0 SCORE: -14  Conformity: True
ab_abcdefghi_EnUsP025M3TSub.csv FMR: 0.00710000000000001 FNMR: 0 SCORE: -14  Conformity: True
ab_abcdefghi_EnUsP050M3TSub.csv FMR: 0.0287999999999999 FNMR: 0 SCORE: -21  Conformity: True
ab_abcdefghi_EnUsP090M3TSub.csv FMR: 0.0113 FNMR: 0 SCORE: -23  Conformity: True

What I need to do is subtract the values in current from the values in default (default - current).

E.g.:

FMR_DIFF = FMR(default) - FMR(test)
FNMR_DIFF = FNMR(default) - FNMR(test)
SCORE_DIFF = SCORE(default) - SCORE(test)

I need to write this to a text file, with the output looking something like this:

O/P:

result:   005M3TSub FMR_DIFF: -0.0032 FNMR_DIFF: 0.023809524 SCORE_DIFF: 10

I am trying to do this in C#. So far I have tried reading the lines of both files, and I was able to compare them. I cannot work out the logic I need to implement. I am very new to programming. Any help is appreciated.

In order to compare the values, you'll first have to parse them. You can create a class that represents a single line of match rates (FMR, the false match rate, and FNMR, the false non-match rate):

public class MatchRateLine
{
    public int LineNumber { get; set; }

    public decimal FMR { get; set; }
    public decimal FNMR { get; set; }
    public int Score { get; set; }
    public bool Conformity { get; set; }
}

Then in your parser you can have a method like this:

public List<MatchRateLine> ParseFile(string filename)
{
    var result = new List<MatchRateLine>();

    using (var reader = new StreamReader(filename))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            result.Add(ParseLine(line));
        }
    }

    return result;
}

And one way to do the actual parsing is this:

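// Requires: using System.Globalization;
public MatchRateLine ParseLine(string line)
{
    var result = new MatchRateLine();

    const string fmrLabel = "FMR: ";
    const string fnmrLabel = "FNMR: ";

    // The FMR value starts right after its label and ends where the FNMR label begins.
    int fmrStart = line.IndexOf(fmrLabel) + fmrLabel.Length;
    int fnmrPosition = line.IndexOf(fnmrLabel);

    string fmrValueString = line.Substring(fmrStart, fnmrPosition - fmrStart).Trim();
    decimal fmrValue;
    // InvariantCulture makes "0.0009" parse the same way on every machine.
    if (decimal.TryParse(fmrValueString, NumberStyles.Float, CultureInfo.InvariantCulture, out fmrValue))
    {
        result.FMR = fmrValue;
    }

    // repeat for other values

    return result;
}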

In the parser I have defined "a line's FMR value" as "the text between 'FMR: ' and 'FNMR: ', parsed as decimal". You'll have to apply this logic for each value you want to extract.
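Repeating the IndexOf/Substring steps for every value gets tedious, so one option is to wrap them in a small helper. The following is only a sketch: the LabeledValueReader class and its method names are hypothetical, and it assumes every value is followed by a space or by the end of the line, as in the sample files.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original answer.
public static class LabeledValueReader
{
    // Returns the text that follows `label`, up to the next space or the end of the line.
    // Throws FormatException when the label is missing, so malformed lines fail loudly.
    public static string ReadValue(string line, string label)
    {
        int labelPosition = line.IndexOf(label, StringComparison.Ordinal);
        if (labelPosition < 0)
        {
            throw new FormatException("Label '" + label + "' not found in: " + line);
        }

        int valueStart = labelPosition + label.Length;
        int valueEnd = line.IndexOf(' ', valueStart);
        if (valueEnd < 0)
        {
            valueEnd = line.Length; // the value runs to the end of the line
        }

        return line.Substring(valueStart, valueEnd - valueStart);
    }

    // Parses the labeled value as a decimal, independent of regional settings.
    public static decimal ReadDecimal(string line, string label)
    {
        return decimal.Parse(ReadValue(line, label), NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // Parses the labeled value as an integer (handles a leading minus sign).
    public static int ReadInt(string line, string label)
    {
        return int.Parse(ReadValue(line, label), CultureInfo.InvariantCulture);
    }
}
```

Inside ParseLine() this would be used as result.FMR = LabeledValueReader.ReadDecimal(line, "FMR: "); and likewise with "FNMR: " and, via ReadInt, "SCORE: ".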

Now that you have two collections, you can loop over them and compare their values:

var defaultLines = Parser.ParseFile("default.txt");
var currentLines = Parser.ParseFile("current.txt");

Your actual question, though, seems to be that you want to compare specific lines in default and current, but you're having trouble identifying which lines belong together. For example, ab_abcdefghi_EnInP090M3TSub is on line 5 in default, while ab_abcdefghi_EnUsP090M3TSub is on line 4 in current (note In/Us).

For this you can extend the MatchRateLine class with a property that stores the filename, or a meaningful substring of it, so you can find the matching line in both lists by this value.

You can again use the Substring() method for this, in the ParseLine() method:

// Position:  0123456789012345678901234567890
// Filename: "ab_abcdefghi_EnInP090M3TSub.csv"

result.ReportCode = line.Substring(17, 6);

This will cause the resulting MatchRateLine to have a ReportCode property with the value P090M3.
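The fixed offsets 17 and 6 only work while the filename prefix keeps the same length. As a sketch of a less position-dependent alternative, the key could be taken from the end of the filename instead. The ReportKey class is hypothetical, and the default key length of 9 is an assumption chosen so that the sample data yields "005M3TSub", the identifier shown in the question's expected output.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original answer.
public static class ReportKey
{
    // Assumption: the key is the last `keyLength` characters of the file name
    // without its ".csv" extension, e.g. "005M3TSub" for the sample data.
    public static string FromLine(string line, int keyLength = 9)
    {
        // The file name is the first space-separated token on the line.
        int firstSpace = line.IndexOf(' ');
        string fileName = firstSpace < 0 ? line : line.Substring(0, firstSpace);

        string stem = Path.GetFileNameWithoutExtension(fileName);
        if (stem.Length < keyLength)
        {
            throw new FormatException("File name too short for a key: " + fileName);
        }

        return stem.Substring(stem.Length - keyLength);
    }
}
```

With this, the EnIn and EnUs variants of the same report produce the same key, so either property value can be used to pair lines across the two files.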

Given the two lists of lines again:

var p090m3DefaultLine = defaultLines.First(l => l.ReportCode == "P090M3");
var p090m3CurrentLine = currentLines.First(l => l.ReportCode == "P090M3");

var fmrDiff = p090m3DefaultLine.FMR - p090m3CurrentLine.FMR;

Please note that this code makes a lot of assumptions about the format and can throw exceptions when the line being parsed doesn't match that format.
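The lookup above handles a single report code. To produce a result line for every pair, one possible approach is to index one list by ReportCode and walk the other. This is a sketch under assumptions: MatchRateDiff and BuildResultLines are hypothetical names, MatchRateLine is repeated with the ReportCode property so the block compiles on its own, and lines without a counterpart are simply skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Repeated here (with ReportCode added) so the sketch is self-contained.
public class MatchRateLine
{
    public string ReportCode { get; set; }
    public decimal FMR { get; set; }
    public decimal FNMR { get; set; }
    public int Score { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class MatchRateDiff
{
    // Pairs lines by ReportCode and formats default - current for each pair.
    public static List<string> BuildResultLines(
        IEnumerable<MatchRateLine> defaultLines,
        IEnumerable<MatchRateLine> currentLines)
    {
        // Assumes each ReportCode occurs at most once per file; ToDictionary throws otherwise.
        Dictionary<string, MatchRateLine> currentByCode =
            currentLines.ToDictionary(l => l.ReportCode);

        var output = new List<string>();
        foreach (MatchRateLine d in defaultLines)
        {
            MatchRateLine c;
            if (!currentByCode.TryGetValue(d.ReportCode, out c))
            {
                continue; // no counterpart in the current file
            }

            output.Add(string.Format(
                CultureInfo.InvariantCulture,
                "result: {0} FMR_DIFF: {1} FNMR_DIFF: {2} SCORE_DIFF: {3}",
                d.ReportCode, d.FMR - c.FMR, d.FNMR - c.FNMR, d.Score - c.Score));
        }

        return output;
    }
}
```

The returned list can then be written out with File.WriteAllLines("result.txt", MatchRateDiff.BuildResultLines(defaultLines, currentLines)); the output file name is only an example.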

It is an interesting problem. Please check the solution below. It is not fully optimized.

First, we create a simple FileStructure class to represent a parsed line:

public class DefaultFileStructure
    {
        public string FileId;
        public decimal FMR;
        public decimal FNMR;
        public int Score;
        public bool Conformity;
    }

Define the constant key names used for parsing.

private static string DEFAULT_KN = "tv_rocscores_DeDeP";
private static string TEST_KN    = "tv_rocscores_FrFrP";

Now parse the file and store the data in a list.

    private List<DefaultFileStructure> GetFileStructure(string filePath, string keyName)
            {
                List<DefaultFileStructure> _defaultFileStructure = new List<DefaultFileStructure>();

                if(!File.Exists(filePath))
                {
                    Console.WriteLine("Error in loading the file");               
                }else{
                    string[] readText = File.ReadAllLines(filePath);
                    foreach (string s in readText)
                    {
                        _defaultFileStructure.Add(ParseLine(s, keyName));                    
                    }
                }

                return _defaultFileStructure;
            }

private DefaultFileStructure ParseLine(string Line, string Keyname)
        {
            DefaultFileStructure _dFileStruc = new DefaultFileStructure();

            string[] groups = Line.Split(new[] { ' ', ' ' },StringSplitOptions.RemoveEmptyEntries);

            /* -- Format structure, assuming the log always uses the same format.
               Could also be modeled with ExpandoObject instead of a fixed class.
                0[tv_rocscores_DeDeP005M3TSub.csv]
                1[FMR:]
                2[0.0009]
                3[FNMR:]
                4[0.023809524]
                5[SCORE:]
                6[-4]
                7[Conformity:]
                8[True]
             */

            _dFileStruc.FileId = groups[0].Replace(Keyname, "");
            _dFileStruc.FMR = decimal.Parse(groups[2]);
            _dFileStruc.FNMR = decimal.Parse(groups[4]);
            _dFileStruc.Score = int.Parse(groups[6]);
            _dFileStruc.Conformity = bool.Parse(groups[8]);

            return _dFileStruc;
        }
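One caveat about the decimal.Parse(groups[2]) calls above: the single-argument overload uses the current culture, so on a machine whose regional settings use a comma as the decimal separator, "0.0009" may be rejected or misread. A minimal sketch of culture-independent parsing follows; the InvariantParsing class and its method names are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original answer.
public static class InvariantParsing
{
    // Parses "0.0009" the same way regardless of the machine's regional settings.
    public static decimal ParseRate(string text)
    {
        return decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // Non-throwing variant: returns false for malformed input instead of raising FormatException.
    public static bool TryParseRate(string text, out decimal value)
    {
        return decimal.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

In ParseLine this would replace decimal.Parse(groups[2]) with InvariantParsing.ParseRate(groups[2]), and similarly for groups[4].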

To compute the differences and produce the result described in your question:

 public void getDiff(String FirstFile, string SecondFile, string ResultFile)
        {
            try
            {
                //Check that both files exist.
                if (!File.Exists(FirstFile)) { return; }
                if (!File.Exists(SecondFile)) { return; }

                //Keep the result String..
                StringBuilder ResultBuilder = new StringBuilder();

                //Get the List of default file.
                List<DefaultFileStructure> DefaultList = GetFileStructure(FirstFile, DEFAULT_KN);

                //Get the List of test file.
                List<DefaultFileStructure> TestList = GetFileStructure(SecondFile, TEST_KN);


                //Get the diff and save in StringBuilder.
                foreach (DefaultFileStructure defFile in DefaultList)
                {
                    bool checkALL = true; // assume no match until one is found (also covers an empty test list)
                    foreach (DefaultFileStructure testFile in TestList)
                    {
                        //Compare the file for diff.
                        if (defFile.FileId == testFile.FileId)
                        {
                            checkALL = false;
                            ResultBuilder.AppendLine(String.Format("result: {0} FMR_DIFF: {1} FNMR_DIFF: {2} SCORE_DIFF: {3}", defFile.FileId, defFile.FMR - testFile.FMR, defFile.FNMR - testFile.FNMR, defFile.Score - testFile.Score));
                            break;
                        }
                        else
                        {
                            checkALL = true;                      
                        }                        
                    }
                    if (checkALL == true)
                    {
                        ResultBuilder.AppendLine(String.Format("result: {0} FMR_DIFF: {1} FNMR_DIFF: {2} SCORE_DIFF: {3}", defFile.FileId, "N/A", "N/A", "N/A"));

                    }
                }

                //File processing completed.
                using (StreamWriter outfile = new StreamWriter(ResultFile))
                {
                    outfile.Write(ResultBuilder.ToString());
                }
            }
            catch (Exception)
            {
                // Rethrow with "throw;" so the original stack trace is preserved.
                throw;
            }
        }

Call the method as follows:

 getDiff(@"I:\Default_DeDe_operational_points_verbose.txt",
         @"I:\FrFr_operational_points_verbose.txt", 
         @"I:\Result.txt");

Thanks, Ajit

You have to specify which lines must be in the output: every "file".csv from default? Every one from current? Both (so that if one is missing from one of the two files, the output must still contain this csv)?

Once you know that, you can implement your logic:

  • Create a class (named FileLine, for example) with the properties of a line: a string for the name (CsvName), a decimal for FMR (say FmrValue), a decimal for FNMR (FnmrValue), and an int for SCORE (ScoreValue)
  • Create a method for the process. It will:

  • Check the structure of the current file: if it is not valid, stop the process

  • Create a new List called defaultLines
  • Create a new List called currentLines
  • Create a string called processedLine (used in a later step)
  • Read the default file: for each line, create a FileLine, parse the line to set the FileLine's properties, and add the FileLine to the list (defaultLines)
  • Read the current file: for each line, create a FileLine, parse the line to set the FileLine's properties, and add the FileLine to the list (currentLines)
  • Then process the comparison (see below)

    public void comparisonGenerator()
    {
        // HERE: add currentFile check

        // Initialization
        List<FileLine> defaultLines = new List<FileLine>();
        List<FileLine> currentLines = new List<FileLine>();
        string processedLine;

        // HERE: add file reading to populate defaultLines and currentLines

        // Comparison
        foreach (FileLine item in defaultLines)
        {
            // Find the item with the same name (easy with LINQ).
            // SingleOrDefault returns null when there is no match; Single would throw instead.
            FileLine cLine = currentLines.SingleOrDefault(l => l.CsvName.Equals(item.CsvName));
            if (cLine != null)
            {
                processedLine = String.Format("result: {0} FMR_DIFF: {1} FNMR_DIFF: {2} SCORE_DIFF: {3}", item.CsvName, item.FmrValue - cLine.FmrValue, item.FnmrValue - cLine.FnmrValue, item.ScoreValue - cLine.ScoreValue);
                // HERE: add this line to future output
            }
        }

        // When all lines are processed, write the output to a file using FileStream
    }
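The choice raised at the start of this answer (lines from default, from current, or from both) can be made explicit before any comparison happens. The following is a sketch only: the OutputScope enum and the OutputKeys class are hypothetical names, and it assumes the names being compared have already been normalized so that matching rows share the same key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, not part of the original answer.
public enum OutputScope { DefaultOnly, CurrentOnly, Both }

public static class OutputKeys
{
    // Returns the names that should appear in the output, without duplicates, in sorted order.
    public static List<string> Select(
        IEnumerable<string> defaultNames,
        IEnumerable<string> currentNames,
        OutputScope scope)
    {
        IEnumerable<string> keys;
        switch (scope)
        {
            case OutputScope.DefaultOnly:
                keys = defaultNames;
                break;
            case OutputScope.CurrentOnly:
                keys = currentNames;
                break;
            default:
                // Both: every name that occurs in at least one of the files.
                keys = defaultNames.Union(currentNames);
                break;
        }

        return keys.Distinct().OrderBy(k => k, StringComparer.Ordinal).ToList();
    }
}
```

The comparison loop can then iterate over the selected names and, when a name is missing from one of the two lists, write a placeholder such as N/A instead of a difference.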
