
What's the best way to get all the content in between two tagged lines of a file so that you can deserialize it?

I've been noticing that the following segment of code does not scale well for large files (I think that appending to the paneContent string is slow):

            string paneContent = String.Empty;
            bool lineFound = false;
            foreach (string line in File.ReadAllLines(path))
            {
                if (line.Contains(tag))
                {
                    lineFound = !lineFound;
                }
                else
                {
                    if (lineFound)
                    {
                        paneContent += line;
                    }
                }
            }
            using (TextReader reader = new StringReader(paneContent))
            {
                data = (PaneData)(serializer.Deserialize(reader));
            }

What's the best way to speed this up? I have a file that looks like this (I want to get all the content between the two tag lines and then deserialize that content):

A line with some tag 
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with some tag

Note: These tags are not XML tags.

You could use a StringBuilder instead of a string; building up text piece by piece is exactly what StringBuilder is for. Some example code is below:

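var paneContent = new StringBuilder();
bool lineFound = false;
// File.ReadLines streams the file lazily instead of loading every line up front.
foreach (string line in File.ReadLines(path))
{
    if (line.Contains(tag))
    {
        // A tag line toggles whether we are inside the tagged block.
        lineFound = !lineFound;
    }
    else if (lineFound)
    {
        // Appends without a line break, as the original concatenation did.
        paneContent.Append(line);
    }
}
using (TextReader reader = new StringReader(paneContent.ToString()))
{
    data = (PaneData)serializer.Deserialize(reader);
}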

As mentioned in this answer, a StringBuilder is preferred over a string when you are concatenating in a loop, which is the case here.

Here is an example of how to use groups with regexes and retrieve their contents afterwards.

What you want is a regex that matches your tags, with the content between them labelled as a group; you can then retrieve that group's data, as in the example.
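A minimal sketch of the regex-group idea, assuming the whole file has already been read into one string and that the tag is literal text (the `TaggedBlock` class and `TryExtract` method are names made up for illustration). It captures everything between the first two lines that contain the tag in a named group called `body`:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
static class TaggedBlock
{
    // Captures the text between the first two lines containing `tag`.
    // Returns false (and an empty body) if no such pair of lines exists.
    public static bool TryExtract(string text, string tag, out string body)
    {
        string t = Regex.Escape(tag); // treat the tag as literal text
        // ^[^\n]*TAG[^\n]*\n : the whole opening tag line
        // (?<body>.*?)       : lazily capture what follows (Singleline lets . match \n)
        // ^[^\n]*TAG         : stop at the next line that contains the tag
        string pattern = @"^[^\n]*" + t + @"[^\n]*\n(?<body>.*?)^[^\n]*" + t;
        Match m = Regex.Match(text, pattern,
            RegexOptions.Multiline | RegexOptions.Singleline);
        body = m.Success ? m.Groups["body"].Value : string.Empty;
        return m.Success;
    }
}
```

Note that this approach needs the full text in memory, so for very large files the line-by-line loop is likely the better fit.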

Use a StringBuilder to build your data string (paneContent). It is much faster because each string concatenation allocates a new string and copies the existing contents into it. StringBuilder appends into a preallocated buffer that grows as needed (if you expect large data strings, you can set the initial capacity).
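As a small illustration of setting the initial allocation, the sketch below passes a capacity to the `StringBuilder(int)` constructor. The `Joiner` class and the `expectedChars` parameter are hypothetical names, and the capacity is only a hint: the builder still grows if the estimate turns out too small.

```csharp
using System.Text;

// Hypothetical helper, not part of any library.
static class Joiner
{
    // Concatenates lines into one string using a pre-sized buffer.
    public static string Join(string[] lines, int expectedChars)
    {
        // Reserve room up front so the buffer rarely needs to grow.
        var sb = new StringBuilder(expectedChars);
        foreach (string line in lines)
        {
            sb.Append(line);
        }
        return sb.ToString();
    }
}
```

If no better estimate is available, the file size in bytes could serve as a rough upper bound for UTF-8 text.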

It's also a good idea to read the input file line by line (for example with File.ReadLines or a StreamReader rather than File.ReadAllLines), so that you avoid loading the whole file into memory when it has many lines of text.
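Combining both suggestions, a line-by-line version might look roughly like the sketch below. It assumes there is a single tagged block per file, so it stops reading at the closing tag line, and unlike the code in the question it keeps a line break after each content line. `TaggedLines` and `ReadBetweenTags` are illustrative names only.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of any library.
static class TaggedLines
{
    // Reads lines one at a time and returns the content between the
    // first two lines that contain `tag`.
    public static string ReadBetweenTags(TextReader reader, string tag)
    {
        var content = new StringBuilder();
        bool inside = false;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Contains(tag))
            {
                if (inside)
                {
                    break; // closing tag line: no need to read further
                }
                inside = true; // opening tag line
            }
            else if (inside)
            {
                // Keep a line break so adjacent lines do not run together.
                content.Append(line).Append('\n');
            }
        }
        return content.ToString();
    }
}
```

With a file, this could be called as `using (var reader = new StreamReader(path)) { string text = TaggedLines.ReadBetweenTags(reader, tag); }` and the result passed to a StringReader for deserialization as before.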
