
Reading respective lines of multiple text files to be combined

Firstly, I'm new to C# and I'm having a hard time figuring out how to go about this.

Basically, I have multiple text files, each holding its own type of data. My aim is to read the first line of each of these files and combine them into one string, so that I can later sort the combined lines by their respective days.

For example, the first line of each file could contain these values:

  • File 1: 16/02/15
  • File 2: Monday
  • File 3: 75.730
  • File 4: 0.470
  • File 5: 75.260
  • File 6: 68182943

So I'd like to combine them into a string like this: "16/02/15 Monday 75.730 0.470 75.260 68182943".

I'd also want to do this for the second, third, fourth line and so on. Each file has a total of 144 entries (lines).

Here is the code I have so far. I'm unsure whether I'm on the right track.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;

namespace BankAlgorithms
{
    class Algorithms
    {
        static void Main(string[] args)
        {
            //Saves each individual text file into their own string arrays.
            string[] Day = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\Day.txt");
            string[] Date = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\Date.txt");
            string[] Close = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\SH1_Close.txt");
            string[] Diff = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\SH1_Diff.txt");
            string[] Open = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\SH1_Open.txt");
            string[] Volume = File.ReadAllLines(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files\SH1_Volume.txt");

            //Lists all files currently stored within the directory
            string[] bankFiles = Directory.GetFiles(@"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files");

            Console.WriteLine("Bank Files currently saved within directory:\n");
            foreach (string name in bankFiles)
            {
                Console.WriteLine(name);
            }

            Console.WriteLine("\nSelect the day you wish to view the data of (Monday-Friday). To view a grouped \nlist of all days, enter \"Day\"\n");
            string selectedArray = Console.ReadLine();

            if (selectedArray == "Day")
            {
                Console.WriteLine("Opening Day File...");
                Console.WriteLine("\nDays grouped up in alphabetical order\n");

                var sort = from s in Day
                           orderby s
                           select s;
                foreach (string c in sort)
                {
                    Console.WriteLine(c);
                }
            }
            Console.ReadLine();
        }
    }
}

Keep your file paths in a collection, read the first line from each file (see "Read only the first few lines of text from a file"), and use a StringBuilder to build your string.

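// Requires: using System.IO; using System.Text;
// fileList is assumed to be a collection of file paths; this snippet sits inside a method returning string.
var builder = new StringBuilder();
foreach (var file in fileList)
{
    using (var reader = new StreamReader(file))
    {
        // Put a space between values so they do not run together.
        if (builder.Length > 0) builder.Append(' ');
        builder.Append(reader.ReadLine());
    }
}

return builder.ToString();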

You could use the following approach: put all the arrays in a string[][] first, which makes the rest easier:

// Date comes first so each merged line starts with the date, as in the question's example.
string[][] all = { Date, Day, Close, Diff, Open, Volume };

To get the length of the shortest array, so that every index exists in all of them:

int commonRange = all.Min(arr => arr.Length);

Now this is all you need:

string[] merged = Enumerable.Range(0, commonRange)
    .Select(i => string.Join(" ", all.Select(arr => arr[i])))
    .ToArray();

This is similar to a for loop from 0 to commonRange in which you access all arrays at the same index and use String.Join to build a single string from the corresponding lines of all files.
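For comparison, the same merge can be written as an explicit loop. This is only a sketch: the class name MergeSketch and the helper MergeLines are hypothetical, and the second row of sample data in Main is made up for illustration.

```csharp
using System;
using System.Linq;

static class MergeSketch
{
    // Hypothetical helper: joins the i-th line of every array with a space.
    // Throws InvalidOperationException if 'all' contains no arrays (Min on an empty sequence).
    public static string[] MergeLines(string[][] all)
    {
        // Only indexes that exist in every array can be merged.
        int commonRange = all.Min(arr => arr.Length);
        var merged = new string[commonRange];

        for (int i = 0; i < commonRange; i++)
        {
            // Collect line i from each file's array, in order.
            var parts = new string[all.Length];
            for (int j = 0; j < all.Length; j++)
            {
                parts[j] = all[j][i];
            }
            merged[i] = string.Join(" ", parts);
        }

        return merged;
    }

    static void Main()
    {
        // First row is taken from the question; the second row is made-up sample data.
        string[][] all =
        {
            new[] { "16/02/15", "17/02/15" },
            new[] { "Monday", "Tuesday" },
            new[] { "75.730", "76.010" },
        };

        foreach (var line in MergeLines(all))
        {
            Console.WriteLine(line);
        }
    }
}
```

The LINQ version is shorter, but the loop makes it easier to see that every array is read at the same index.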


Since you have commented that you want to merge only the lines for a specific day, first collect the indexes of the matching lines and then merge only those:

var lineIndexes = Day.Take(commonRange)
    .Select((line, index) => new { line, index })
    .Where(x => x.line.TrimStart().StartsWith("Monday", StringComparison.InvariantCultureIgnoreCase))
    .Select(x => x.index);

string[] merged = lineIndexes
    .Select(i => string.Join(" ", all.Select(arr => arr[i])))
    .ToArray();

This might be a little more than you strictly need, but I think it is robust and quite flexible, and it can handle huge files if need be because it reads them line by line.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace ConsoleApplication2
{
class Program
{
    private static void Main(string[] args)
    {
      //  const string folder = @"C:\Users\computing\Desktop\algorithms\CMP1124M_Assigment_Files";
        const string folder = @"C:\Temp\SO";
        var filenames = new[] { @"Date.txt", @"Day.txt", @"SH1_Close.txt", @"SH1_Diff.txt", @"SH1_Open.txt", @"SH1_Volume.txt" };

        var dataCombiner = new DataCombiner(folder, filenames);
        var stockParser = new StockParser();
        foreach (var stock in dataCombiner.GetCombinedData(stockParser.Parse)) //can also use where clause here
        { 
            if (ShowRow(stock))
            {
                var outputText = stock.ToString();
                Console.WriteLine(outputText);
            }
        }

        Console.ReadLine();
    }

    private static bool ShowRow(Stock stock)
    {
        //use input from user etc...
        return     (stock.DayOfWeek == "Tuesday" || stock.DayOfWeek == "Monday")
                && stock.Volume > 1000 
                && stock.Diff < 10; // etc
    }
}

internal class DataCombiner
{
    private readonly string _folder;
    private readonly string[] _filenames;

    public DataCombiner(string folder, string[] filenames)
    {
        _folder = folder;
        _filenames = filenames;
    }

    private static IEnumerable<string> GetFilePaths(string folder, params string[] filenames)
    {
        return filenames.Select(filename => Path.Combine(folder, filename));
    }

    public IEnumerable<T> GetCombinedData<T>(Func<string[], T> parserMethod) where T : class
    {
        var filePaths = GetFilePaths(_folder, _filenames).ToArray();
        var files = filePaths.Select(filePath => new StreamReader(filePath)).ToList();
        try
        {
            // This can be replaced with a simple for loop if the files will always have a fixed number of rows
            while (true)
            {
                // Read the next line from every file; stop as soon as any file runs out.
                var rawData = files.Select(file => file.ReadLine()).ToArray();
                if (rawData.Any(line => line == null)) yield break;
                yield return parserMethod(rawData);
            }
        }
        finally
        {
            // Runs when enumeration completes or the enumerator is disposed early.
            foreach (var file in files) file.Dispose();
        }
    }
}

internal class Stock
{
    public DateTime Date { get; set; }
    public string DayOfWeek { get; set; }
    public double Open { get; set; }
    public double Close { get; set; }
    public double Diff { get; set; }
    public int Volume { get; set; }

    public override string ToString()
    {
        //Whatever format you want
        return string.Format("{0:d} {1} {2} {3} {4} {5}", Date, DayOfWeek, Close, Diff, Open, Volume);
    }
}

internal class StockParser
{
    public Stock Parse(string[] rawData)
    {
        //TODO: Error handling required here
        var stock = new Stock();
        stock.Date = DateTime.Parse(rawData[0]);
        stock.DayOfWeek = rawData[1];
        stock.Close = double.Parse(rawData[2]);
        stock.Diff = double.Parse(rawData[3]);
        stock.Open = double.Parse(rawData[4]);
        stock.Volume = int.Parse(rawData[5]); 
        return stock;
    }

    public string ParseToRawText(string[] rawData)
    {
        return string.Join(" ", rawData);
    }
}
}
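The TODO in StockParser.Parse could be approached with the TryParse family of methods, which report failure instead of throwing. The following is a minimal sketch, not part of the answer's code: SafeParseSketch and TryParseRow are hypothetical names, only two of the six fields are parsed to keep it short, and it assumes the numbers use "." as the decimal separator.

```csharp
using System;
using System.Globalization;

static class SafeParseSketch
{
    // Hypothetical helper: returns false instead of throwing when the row is
    // too short or a field is malformed. Indexes follow the answer's layout:
    // [2] = Close, [5] = Volume.
    public static bool TryParseRow(string[] rawData, out double close, out int volume)
    {
        close = 0;
        volume = 0;
        if (rawData == null || rawData.Length < 6) return false;

        // InvariantCulture means "75.730" parses the same on every machine.
        return double.TryParse(rawData[2], NumberStyles.Float, CultureInfo.InvariantCulture, out close)
            && int.TryParse(rawData[5], NumberStyles.Integer, CultureInfo.InvariantCulture, out volume);
    }

    static void Main()
    {
        // Sample row taken from the question.
        var row = new[] { "16/02/15", "Monday", "75.730", "0.470", "75.260", "68182943" };

        double close;
        int volume;
        if (TryParseRow(row, out close, out volume))
        {
            Console.WriteLine("{0} {1}", close, volume);
        }
        else
        {
            Console.WriteLine("Skipping malformed row");
        }
    }
}
```

Whether a bad row should be skipped, logged, or treated as fatal depends on the data; the sketch only shows how to detect it.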

PS: Instead of reading it from the file, I'd rather calculate the DayOfWeek from the Date. Also be careful when parsing dates written for a different locale (e.g. USA vs UK). If you have the option, I'd use only the ISO 8601 date/time format.
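As a rough illustration of both points, the sketch below parses a date with an explicit format rather than the machine's locale and derives the weekday from it. It assumes the file stores dates as day/month/two-digit year ("dd/MM/yy"), which the question does not state outright; DateSketch and ParseUkDate are hypothetical names.

```csharp
using System;
using System.Globalization;

static class DateSketch
{
    // Hypothetical helper. Assumes day/month/two-digit-year input such as "16/02/15".
    // ParseExact throws FormatException if the text does not match the format.
    public static DateTime ParseUkDate(string text)
    {
        return DateTime.ParseExact(text.Trim(), "dd/MM/yy", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        DateTime date = ParseUkDate("16/02/15");

        // The weekday is computed from the date, so Day.txt is not needed for it.
        Console.WriteLine(date.DayOfWeek); // Monday

        // ISO 8601 style output is unambiguous across locales.
        Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)); // 2015-02-16
    }
}
```

Using ParseExact with a fixed format avoids the US/UK ambiguity that DateTime.Parse has when it relies on the current culture.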
