
Read double values from a file in C#

I have a txt file whose format is:

0.32423 1.3453 3.23423
0.12332 3.1231 9.23432432
9.234324234 -1.23432 12.23432
...

Each line has three double values. There are more than 10000 lines in this file. I can use StreamReader.ReadLine and String.Split, then convert the strings. I want to know whether there is any faster way to do it.

Best Regards,

StreamReader.ReadLine, String.Split and Double.TryParse sound like a good solution here.
No need for improvement.

There may be some little micro-optimisations you can perform, but the way you've suggested sounds about as simple as you'll get.

10000 lines shouldn't take very long - have you tried it and found you've actually got a performance problem? For example, here are two short programs - one creates a 10,000 line file and the other reads it:

CreateFile.cs:

using System;
using System.IO;

public class Test
{
    static void Main()
    {
        Random rng = new Random();
        using (TextWriter writer = File.CreateText("test.txt"))
        {
            for (int i = 0; i < 10000; i++)
            {
                writer.WriteLine("{0} {1} {2}", rng.NextDouble(),
                                 rng.NextDouble(), rng.NextDouble());
            }
        }
    }
}

ReadFile.cs:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

public class Test
{
    static void Main()
    {   
        Stopwatch sw = Stopwatch.StartNew();
        using (TextReader reader = File.OpenText("test.txt"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] bits = line.Split(' ');
                foreach (string bit in bits)
                {
                    double value;
                    if (!double.TryParse(bit, out value))
                    {
                        Console.WriteLine("Bad value");
                    }
                }
            }
        }
        sw.Stop();
        Console.WriteLine("Total time: {0}ms",
                          sw.ElapsedMilliseconds);
    }
}

On my netbook (which admittedly has an SSD in) it only takes 82ms to read the file. I would suggest that's probably not a problem :)

I would suggest reading all your lines at once with

string[] lines = System.IO.File.ReadAllLines(fileName);

This would ensure that the I/O is done with maximum efficiency. You would have to measure (profile), but I would expect the conversions to take far less time than the I/O.
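
For illustration, a minimal sketch of how File.ReadAllLines could be combined with the parsing step (assuming space-separated values as in the question; the file name and class name are hypothetical):

using System;
using System.Collections.Generic;
using System.IO;

class ReadAllLinesExample
{
    static void Main()
    {
        // Read the whole file into memory in one pass, then parse each line.
        string[] lines = File.ReadAllLines("test.txt");
        var rows = new List<double[]>(lines.Length);

        foreach (string line in lines)
        {
            string[] bits = line.Split(' ');
            var row = new double[bits.Length];
            for (int i = 0; i < bits.Length; i++)
            {
                row[i] = double.Parse(bits[i]);
            }
            rows.Add(row);
        }

        Console.WriteLine("Parsed {0} rows", rows.Count);
    }
}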

This solution is a little bit slower (see the benchmarks at the end), but it's nicer to read. It should also be more memory-efficient, because only the current word is buffered at a time (instead of the whole file or line).

Reading arrays is an additional feature of this reader; it assumes that the size of the array always comes first as an int value.

IParsable is another feature that makes it easy to implement Parse methods for various types.

using System.IO;
using System.Text;

class StringStreamReader {
    private StreamReader sr;

    public StringStreamReader(StreamReader sr) {
        this.sr = sr;
        this.Separator = ' ';
    }

    private StringBuilder sb = new StringBuilder();

    // Reads the next separator- or newline-delimited token; sets eol when a newline is consumed.
    public string ReadWord() {
        eol = false;
        sb.Clear();
        char c;
        while (!sr.EndOfStream) {
            c = (char)sr.Read();
            if (c == Separator) break;
            if (IsNewLine(c)) {
                eol = true;
                char nextch = (char)sr.Peek();
                while (IsNewLine(nextch)) {
                    sr.Read(); // consume all newlines
                    nextch = (char)sr.Peek();
                }
                break;
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    private bool IsNewLine(char c) {
        return c == '\r' || c == '\n';
    }

    public int ReadInt() {
        return int.Parse(ReadWord());
    }

    public double ReadDouble() {
        return double.Parse(ReadWord());
    }

    public bool EOF {
        get { return sr.EndOfStream; }
    }

    public char Separator { get; set; }

    bool eol;
    public bool EOL {
        get { return eol || sr.EndOfStream; }
    }

    public T ReadObject<T>() where T : IParsable, new() {
        var obj = new T();
        obj.Parse(this);
        return obj;
    }

    public int[] ReadIntArray() {
        int size = ReadInt();
        var a = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadInt();
        }
        return a;
    }

    public double[] ReadDoubleArray() {
        int size = ReadInt();
        var a = new double[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadDouble();
        }
        return a;
    }

    public T[] ReadObjectArray<T>() where T : IParsable, new() {
        int size = ReadInt();
        var a = new T[size];
        for (int i = 0; i < size; i++) {
            a[i] = ReadObject<T>();
        }
        return a;
    }

    internal void NextLine() {
        eol = false;
    }
}

interface IParsable {
    void Parse(StringStreamReader r);
}

It can be used like this:

public void Parse(StringStreamReader r) {
    double x = r.ReadDouble();
    int y = r.ReadInt();
    string z = r.ReadWord();
    double[] arr = r.ReadDoubleArray();
    MyParsableObject o = r.ReadObject<MyParsableObject>();
    MyParsableObject [] oarr = r.ReadObjectArray<MyParsableObject>();
}

I did some benchmarking, comparing StringStreamReader with some other approaches already proposed (StreamReader.ReadLine and File.ReadAllLines). Here are the methods I used for benchmarking:

private static void Test_StringStreamReader(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
        var r = new StringStreamReader(sr);
        r.Separator = ' ';
        while (!r.EOF) {
            var dbls = new List<double>();
            while (!r.EOF) {
                dbls.Add(r.ReadDouble());
            }
        }
    }
    sw.Stop();
    Console.WriteLine("elapsed: {0}", sw.Elapsed);
}

private static void Test_ReadLine(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
        var dbls = new List<double>();

        while (!sr.EndOfStream) {
            string line = sr.ReadLine();
            string[] bits = line.Split(' ');
            foreach(string bit in bits) {
                dbls.Add(double.Parse(bit));
            }
        }
    }
    sw.Stop();
    Console.WriteLine("elapsed: {0}", sw.Elapsed);
}

private static void Test_ReadAllLines(string filename) {
    var sw = new Stopwatch();
    sw.Start();
    string[] lines = System.IO.File.ReadAllLines(filename);
    var dbls = new List<double>();
    foreach(var line in lines) {
        string[] bits = line.Split(' ');
        foreach (string bit in bits) {
            dbls.Add(double.Parse(bit));
        }
    }        
    sw.Stop();
    Console.WriteLine("Test_ReadAllLines: {0}", sw.Elapsed);
}

I used a file with 1,000,000 lines of double values (3 values per line). The file is located on an SSD and each test was repeated multiple times in release mode. These are the results (on average):

Test_StringStreamReader: 00:00:01.1980975
Test_ReadLine:           00:00:00.9117553
Test_ReadAllLines:       00:00:01.1362452

So, as mentioned, StringStreamReader is a bit slower than the other approaches. For 10,000 lines, the times are around 120 ms / 95 ms / 100 ms respectively.

Your method is already good!

You can improve it by writing a ReadLine-style helper function that returns an array of doubles, so you can reuse it in other programs; a sketch follows.
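
A minimal sketch of such a helper, assuming space-separated values and invariant-culture formatting (the name ReadDoubleLine is hypothetical):

using System;
using System.Globalization;
using System.IO;

static class DoubleLineReader
{
    // Reads the next line and parses it into an array of doubles.
    // Returns null when the end of the stream has been reached.
    public static double[] ReadDoubleLine(TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null)
            return null;

        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var values = new double[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            values[i] = double.Parse(parts[i], CultureInfo.InvariantCulture);
        }
        return values;
    }
}

Used with a StreamReader, this could replace the Split-and-parse block inside the loop in ReadFile.cs above.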
