
Read double values from a file in C#

I have a txt file in the following format:

0.32423 1.3453 3.23423
0.12332 3.1231 9.23432432
9.234324234 -1.23432 12.23432
...

Each line has three double values. There are more than 10000 lines in this file. I can use StreamReader.ReadLine and String.Split, then convert each value. I want to know whether there is any faster way to do it.

Best Regards,

StreamReader.ReadLine, String.Split and Double.TryParse sound like a good solution here.
No need for improvement.

There may be some small micro-optimisations you can perform, but the way you've suggested sounds about as simple as you'll get.

10000 lines shouldn't take very long. Have you tried it and found you've actually got a performance problem? For example, here are two short programs: one creates a 10,000 line file and the other reads it:

CreateFile.cs:

using System;
using System.IO;

public class Test
{
    static void Main()
    {
        Random rng = new Random();
        using (TextWriter writer = File.CreateText("test.txt"))
        {
            for (int i = 0; i < 10000; i++)
            {
                writer.WriteLine("{0} {1} {2}", rng.NextDouble(),
                                 rng.NextDouble(), rng.NextDouble());
            }
        }
    }
}

ReadFile.cs:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

public class Test
{
    static void Main()
    {   
        Stopwatch sw = Stopwatch.StartNew();
        using (TextReader reader = File.OpenText("test.txt"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] bits = line.Split(' ');
                foreach (string bit in bits)
                {
                    double value;
                    if (!double.TryParse(bit, out value))
                    {
                        Console.WriteLine("Bad value");
                    }
                }
            }
        }
        sw.Stop();
        Console.WriteLine("Total time: {0}ms",
                          sw.ElapsedMilliseconds);
    }
}

On my netbook (which admittedly has an SSD in it) it only takes 82ms to read the file. I would suggest that's probably not a problem :)

I would suggest reading all your lines at once with

string[] lines = System.IO.File.ReadAllLines(fileName);

This would ensure that the I/O is done with maximum efficiency. You would have to measure (profile), but I would expect the conversions to take far less time.
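As a rough illustration of this suggestion, the sketch below reads the file with File.ReadAllLines and converts the lines in a second pass. The names LineParsing, ParseLine, LoadAll and LoadLazily are made up for this example. Parsing with CultureInfo.InvariantCulture and skipping blank lines are assumptions about the file, not something the question states. File.ReadLines is shown as a lazy alternative for cases where holding the whole file in memory is undesirable.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class LineParsing
{
    // Converts one line such as "0.32423 1.3453 3.23423" into its values.
    // Assumption: space-separated values with '.' as the decimal separator.
    public static double[] ParseLine(string line)
    {
        return line
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => double.Parse(token, CultureInfo.InvariantCulture))
            .ToArray();
    }

    // Reads the whole file into memory first, then parses every line.
    public static List<double[]> LoadAll(string fileName)
    {
        string[] lines = File.ReadAllLines(fileName);
        var rows = new List<double[]>(lines.Length);
        foreach (string line in lines)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines
            rows.Add(ParseLine(line));
        }
        return rows;
    }

    // Lazy variant: File.ReadLines streams the file line by line,
    // so only one line is held as a string at a time.
    public static IEnumerable<double[]> LoadLazily(string fileName)
    {
        return File.ReadLines(fileName)
                   .Where(line => line.Trim().Length > 0)
                   .Select(line => ParseLine(line));
    }
}
```

Calling LineParsing.LoadAll("test.txt") returns one double[] per non-blank line. Whether this is actually faster than a ReadLine loop has to be measured on the real file.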

This solution is a little bit slower (see benchmarks at the end), but it's nicer to read. It should also be more memory efficient because only the current word is held at a time (instead of the whole file or line).

Reading arrays is an additional feature of this reader; it assumes that the size of the array always comes first as an int value.

IParsable is another feature that makes it easy to implement Parse methods for various types.

using System.IO;
using System.Text;

class StringSteamReader {
    private StreamReader sr;

    public StringSteamReader(StreamReader sr) {
        this.sr = sr;
        this.Separator = ' ';
    }

    private StringBuilder sb = new StringBuilder();
    public string ReadWord() {
        eol = false;
        sb.Clear();
        char c;
        while (!sr.EndOfStream) {
            c = (char)sr.Read();
            if (c == Separator) break;
            if (IsNewLine(c)) {
                eol = true;
                char nextch = (char)sr.Peek();
                while (IsNewLine(nextch)) {
                    sr.Read(); // consume all newlines
                    nextch = (char)sr.Peek();
                }
                break;
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    private bool IsNewLine(char c) {
        return c == '\r' || c == '\n';
    }

    public int ReadInt() {
        return int.Parse(ReadWord());
    }

    public double ReadDouble() {
        return double.Parse(ReadWord());
    }

    public bool EOF {
        get { return sr.EndOfStream; }
    }

    public char Separator { get; set; }

    bool eol;
    public bool EOL {
        get { return eol || sr.EndOfStream; }
    }

    public T ReadObject<T>() where T : IParsable, new() {
        var obj = new T();
        obj.Parse(this);
        return obj;
    }

    public int[] ReadIntArray() {
        int size = ReadInt();
        var a = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadInt();
        }
        return a;
    }

    public double[] ReadDoubleArray() {
        int size = ReadInt();
        var a = new double[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadDouble();
        }
        return a;
    }

    public T[] ReadObjectArray<T>() where T : IParsable, new() {
        int size = ReadInt();
        var a = new T[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadObject<T>();
        }
        return a;
    }

    internal void NextLine() {
        eol = false;
    }
}

interface IParsable {
    void Parse(StringSteamReader r);
}

It can be used like this:

public void Parse(StringSteamReader r) {
    double x = r.ReadDouble();
    int y = r.ReadInt();
    string z = r.ReadWord();
    double[] arr = r.ReadDoubleArray();
    MyParsableObject o = r.ReadObject<MyParsableObject>();
    MyParsableObject [] oarr = r.ReadObjectArray<MyParsableObject>();
}

I did some benchmarking, comparing StringStreamReader with the other approaches already proposed (StreamReader.ReadLine and File.ReadAllLines). Here are the methods I used for benchmarking:

private static void Test_StringStreamReader(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
        var r = new StringSteamReader(sr);
        r.Separator = ' ';
        var dbls = new List<double>();
        while (!r.EOF) {
            dbls.Add(r.ReadDouble());
        }
    }
    sw.Stop();
    Console.WriteLine("Test_StringStreamReader: {0}", sw.Elapsed);
}

private static void Test_ReadLine(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
        var dbls = new List<double>();

        while (!sr.EndOfStream) {
            string line = sr.ReadLine();
            string[] bits = line.Split(' ');
            foreach(string bit in bits) {
                dbls.Add(double.Parse(bit));
            }
        }
    }
    sw.Stop();
    Console.WriteLine("Test_ReadLine: {0}", sw.Elapsed);
}

private static void Test_ReadAllLines(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    string[] lines = System.IO.File.ReadAllLines(filename);
    var dbls = new List<double>();
    foreach(var line in lines) {
        string[] bits = line.Split(' ');
        foreach (string bit in bits) {
            dbls.Add(double.Parse(bit));
        }
    }        
    sw.Stop();
    Console.WriteLine("Test_ReadAllLines: {0}", sw.Elapsed);
}

I used a file with 1,000,000 lines of double values (3 values per line). The file is located on an SSD and each test was repeated multiple times in release mode. These are the results (on average):

Test_StringStreamReader: 00:00:01.1980975
Test_ReadLine:           00:00:00.9117553
Test_ReadAllLines:       00:00:01.1362452

So, as mentioned, StringStreamReader is a bit slower than the other approaches. For 10,000 lines, the times are roughly 120ms / 95ms / 100ms.
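The answer does not show how the repeated runs were averaged. One possible way to do it is sketched below. The Bench class and its AverageMilliseconds method are hypothetical names, and the unmeasured warm-up run is an assumption added here so that JIT compilation and a cold file cache do not distort the first measurement.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Runs the action once as a warm-up, then returns the mean elapsed
    // time in milliseconds over the given number of measured runs.
    public static double AverageMilliseconds(Action action, int repetitions)
    {
        if (repetitions <= 0)
        {
            throw new ArgumentOutOfRangeException("repetitions");
        }

        action(); // warm-up run, not measured

        var sw = new Stopwatch();
        for (int i = 0; i < repetitions; i++)
        {
            sw.Start();  // Stopwatch accumulates time across Start/Stop pairs
            action();
            sw.Stop();
        }
        return sw.Elapsed.TotalMilliseconds / repetitions;
    }
}
```

A call such as Bench.AverageMilliseconds(() => Test_ReadLine("test.txt"), 10) would time one of the methods above, provided its own Stopwatch output is ignored or removed.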

Your method is already good!

You can improve it by writing a readline function that returns an array of doubles, so that you can reuse the function in other programs.
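A minimal sketch of such a reusable function might look like the following. The names DoubleLineReader and ReadDoubleLine are invented for this example, and it assumes values separated by spaces or tabs with '.' as the decimal separator, which is why it parses with CultureInfo.InvariantCulture.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class DoubleLineReader
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Reads the next line from the reader and returns its values,
    // or null once the end of the input has been reached.
    // A blank line yields an empty array; an invalid token makes
    // double.Parse throw a FormatException.
    public static double[] ReadDoubleLine(TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null) return null;

        string[] tokens = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        double[] values = new double[tokens.Length];
        for (int i = 0; i < tokens.Length; i++)
        {
            values[i] = double.Parse(tokens[i], NumberStyles.Float,
                                     CultureInfo.InvariantCulture);
        }
        return values;
    }
}
```

Because it takes a TextReader, the same function works with File.OpenText for files and with StringReader for tests; a caller loops until it returns null.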
