
How to make conversion of txt file more efficient?

I'm trying to read in a txt file (CSV or tab delimited), convert each line into a Vector3, and add it to an array for further processing.

My code so far works, but it takes a while to read in a file. Each file is between 6 MB and 25 MB.

The code runs through and does what I expect it to, but it seems to bottleneck somewhere in this foreach statement. Is there a quicker way, or is this to be expected?

String[] pntsText = File.ReadAllLines(args[0]);
List<Vector3> pnts = new List<Vector3>();
Console.WriteLine("Start Building Points Array ...");
int noOfPnts = pntsText.Length;
int currentPntNo=0;
Console.CursorVisible = false;

foreach (string pntText in pntsText)
{
    currentPntNo++;
    Console.Clear();
    Console.Write(noOfPnts - currentPntNo + " left to process");
    string[] splitXYZ = pntText.Split(new string[] { args[1] }, StringSplitOptions.None);
    Vector3 ve2 = new Vector3(float.Parse(splitXYZ[0]), float.Parse(splitXYZ[1]), float.Parse(splitXYZ[2]));
    pnts.Add(ve2);
}

Console.WriteLine("Points Array Complete");

I believe the bottleneck is the console output inside the loop: Console.Clear() and Console.Write run once for every line. Comment them out and check whether performance improves. I would also suggest using Stopwatch to time your program's execution.
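To make that concrete, here is a minimal sketch that times the parse with Stopwatch and updates the console only every few thousand lines instead of on each one. It assumes Vector3 is System.Numerics.Vector3 (the question does not say which Vector3 type is in use) and that the numbers use '.' as the decimal separator; PointLoader, Parse, ParseTimed and reportEvery are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Numerics; // assumption: Vector3 is System.Numerics.Vector3

public static class PointLoader
{
    // Parses "x<sep>y<sep>z" lines and reports progress every reportEvery lines
    // (hypothetical parameter) rather than on every iteration.
    public static List<Vector3> Parse(string[] lines, string separator, int reportEvery = 10000)
    {
        var pnts = new List<Vector3>(lines.Length); // pre-size to avoid regrowth
        var separators = new[] { separator };       // allocate the separator array once

        for (int i = 0; i < lines.Length; i++)
        {
            string[] xyz = lines[i].Split(separators, StringSplitOptions.None);
            pnts.Add(new Vector3(
                float.Parse(xyz[0], CultureInfo.InvariantCulture),
                float.Parse(xyz[1], CultureInfo.InvariantCulture),
                float.Parse(xyz[2], CultureInfo.InvariantCulture)));

            // "\r" returns to the start of the line, so no Console.Clear() is needed.
            if ((i + 1) % reportEvery == 0)
                Console.Write("\r" + (lines.Length - i - 1) + " left to process   ");
        }
        return pnts;
    }

    // Wraps Parse in a Stopwatch so the effect of a change can be measured.
    public static List<Vector3> ParseTimed(string[] lines, string separator)
    {
        Stopwatch sw = Stopwatch.StartNew();
        List<Vector3> pnts = Parse(lines, separator);
        sw.Stop();
        Console.WriteLine();
        Console.WriteLine("Parsed " + pnts.Count + " points in " + sw.ElapsedMilliseconds + " ms");
        return pnts;
    }
}
```

With the question's variables this would be called as PointLoader.ParseTimed(pntsText, args[1]). Running it once with reporting and once with a very large reportEvery should show how much of the time goes to the console.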

You can also try the following LINQ query to get a list of Vector3:

List<Vector3> list = pntsText
    .Select(r => new { Splitted = r.Split(new string[] { "," }, StringSplitOptions.None) })
    .Select(t => new Vector3(float.Parse(t.Splitted[0]), float.Parse(t.Splitted[1]), float.Parse(t.Splitted[2])))
    .ToList();

This still loops internally, though, so I am not sure you will see any performance gain from it, and you will not get console output while it is processing.

You are using the Split method to split your points:

string[] splitXYZ = pntText.Split(new string[] 
  { args[1] }, StringSplitOptions.None);

Calling this inside the loop is not very performant, since it allocates memory for the returned array object and a String object for each array element. Consider using IndexOf combined with Substring. I am not sure how much faster it is; you will have to test it.

The documentation covers this issue:

Performance Considerations

The Split methods allocate memory for the returned array object and a String object for each array element. If your application requires optimal performance or if managing memory allocation is critical in your application, consider using the IndexOf or IndexOfAny method, and optionally the Compare method, to locate a substring within a string.

If you are splitting a string at a separator character, use the IndexOf or IndexOfAny method to locate a separator character in the string. If you are splitting a string at a separator string, use the IndexOf or IndexOfAny method to locate the first character of the separator string. Then use the Compare method to determine whether the characters after that first character are equal to the remaining characters of the separator string.
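As a rough illustration of the IndexOf plus Substring approach, the sketch below locates the two separators directly and parses the three fields without building an intermediate array. It assumes a single-character separator, exactly three fields per line, '.' as the decimal separator, and System.Numerics.Vector3; IndexOfParser and ParseLine are hypothetical names. Substring still allocates one string per field, so measure before assuming it is faster.

```csharp
using System;
using System.Globalization;
using System.Numerics; // assumption: Vector3 is System.Numerics.Vector3

public static class IndexOfParser
{
    // Parses one "x<sep>y<sep>z" line by finding the two separators with IndexOf.
    // Assumes exactly three fields; anything after the second separator is treated as z.
    public static Vector3 ParseLine(string line, char sep)
    {
        int first = line.IndexOf(sep);
        int second = first < 0 ? -1 : line.IndexOf(sep, first + 1);
        if (first < 0 || second < 0)
            throw new FormatException("Expected three fields: " + line);

        float x = float.Parse(line.Substring(0, first), CultureInfo.InvariantCulture);
        float y = float.Parse(line.Substring(first + 1, second - first - 1), CultureInfo.InvariantCulture);
        float z = float.Parse(line.Substring(second + 1), CultureInfo.InvariantCulture);
        return new Vector3(x, y, z);
    }
}
```

Inside the question's loop this would replace the Split call and the three Parse calls with pnts.Add(IndexOfParser.ParseLine(pntText, args[1][0])), where args[1][0] takes the first character of the separator argument.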

Another point is that you create a Vector3 object for every line, which involves three float.Parse calls, and that costs some performance too:

Vector3 ve2 = new Vector3(float.Parse(splitXYZ[0]), 
   float.Parse(splitXYZ[1]), float.Parse(splitXYZ[2]));

If this isn't really needed at this point (it depends on your needs), you can keep the information as text, or in a struct, and create the Vector3 object once you need to process it later.

Hope this helps

Read the whole file into one string and call str.Split(new[] {',', '\n'}) to get a single array of all the vector parts. Then loop through, parsing them in 3s. This would prevent multiple calls to Split. Also avoid updating the console on every iteration. Maybe every 100th?
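A minimal sketch of that idea follows. It assumes a single-character separator, '.' as the decimal separator, and System.Numerics.Vector3; BulkParser and ParseAll are hypothetical names. It also splits on '\r' and uses StringSplitOptions.RemoveEmptyEntries so that Windows line endings and a trailing newline do not produce empty parts, which goes slightly beyond the one-liner above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Numerics; // assumption: Vector3 is System.Numerics.Vector3

public static class BulkParser
{
    // Splits the whole file content once, then consumes the parts three at a time.
    public static List<Vector3> ParseAll(string text, char sep)
    {
        string[] parts = text.Split(new[] { sep, '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length % 3 != 0)
            throw new FormatException("Number of values is not a multiple of 3: " + parts.Length);

        var pnts = new List<Vector3>(parts.Length / 3);
        for (int i = 0; i < parts.Length; i += 3)
        {
            pnts.Add(new Vector3(
                float.Parse(parts[i], CultureInfo.InvariantCulture),
                float.Parse(parts[i + 1], CultureInfo.InvariantCulture),
                float.Parse(parts[i + 2], CultureInfo.InvariantCulture)));
        }
        return pnts;
    }
}
```

It could be called as BulkParser.ParseAll(File.ReadAllText(args[0]), args[1][0]). One trade-off: a single Split over a 25 MB file creates one very large array, and an empty field is silently skipped rather than reported on its own line, so this suits clean input best.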
