What's the best way to get all the content in between two tagged lines of a file so that you can deserialize it?

I've noticed that the following segment of code does not scale well for large files (I think that appending to the paneContent string is slow):

            string paneContent = String.Empty;
            bool lineFound = false;
            foreach (string line in File.ReadAllLines(path))
            {
                if (line.Contains(tag))
                {
                    lineFound = !lineFound;
                }
                else
                {
                    if (lineFound)
                    {
                        paneContent += line;
                    }
                }
            }
            using (TextReader reader = new StringReader(paneContent))
            {
                data = (PaneData)(serializer.Deserialize(reader));
            }

What's the best way to speed this up? I have a file that looks like this (I want to get all the content in between the two tag lines and then deserialize that content):

A line with some tag 
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with some tag

Note: These tags are not XML tags.

You could use a StringBuilder instead of a string; that is what StringBuilder is for. Some example code is below:

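// StringBuilder appends into a growable buffer instead of creating a new string per line.
var paneContent = new StringBuilder();
bool lineFound = false;
// File.ReadLines enumerates the file lazily; File.ReadAllLines loads every line up front.
foreach (string line in File.ReadLines(path))
{
    if (line.Contains(tag))
    {
        // Each tag line toggles whether we are inside the tagged section.
        lineFound = !lineFound;
    }
    else if (lineFound)
    {
        // As in the question's code, lines are joined without line breaks.
        paneContent.Append(line);
    }
}
using (TextReader reader = new StringReader(paneContent.ToString()))
{
    data = (PaneData)serializer.Deserialize(reader);
}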

As mentioned in this answer, a StringBuilder is preferred over string concatenation when you are concatenating in a loop, which is the case here.

Here is an example of how to use groups with regexes and retrieve their contents afterwards.

What you want is a regex that matches your tag lines and captures the content between them as a group; you can then retrieve the data of that group as in the example.
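A minimal sketch of the group-based approach, assuming the whole file is already in memory as one string and that only the first pair of tag lines matters (the loop in the question toggles on every tag line, so it can collect several sections). The `TagExtractor` class, the `ExtractBetweenTags` helper, and the `body` group name are hypothetical, introduced here for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagExtractor
{
    // Hypothetical helper: returns the text between the first two lines that
    // contain `tag`, or null when no such pair of lines exists.
    public static string ExtractBetweenTags(string text, string tag)
    {
        // Escape the tag so characters such as '[' or '.' are matched literally.
        string escaped = Regex.Escape(tag);

        // escaped + [^\n]*\n   : the first tag and the rest of its line
        // (?<body>.*?)         : lazily capture everything after that line
        // ^[^\n]* + escaped    : stop at the next line that contains the tag
        string pattern = escaped + @"[^\n]*\n(?<body>.*?)^[^\n]*" + escaped;

        // Singleline lets '.' match newlines; Multiline lets '^' match at line starts.
        Match match = Regex.Match(
            text, pattern, RegexOptions.Singleline | RegexOptions.Multiline);

        return match.Success ? match.Groups["body"].Value : null;
    }
}
```

The captured group keeps the line breaks of the original text, so it could be passed to a `StringReader` the same way `paneContent` is in the question. Because this sketch needs the entire file as a single string (for example from `File.ReadAllText`), it trades memory for brevity.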

Use a StringBuilder to build your data string (paneContent). It's much faster because each string concatenation allocates a new string and copies the existing contents into it. StringBuilder instead appends into an internal buffer that grows as needed (if you expect large data strings, you can set the initial capacity).

It's a good idea to read your input file line by line, so you avoid loading the whole file into memory when you expect files with many lines of text.
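The two suggestions above can be combined roughly as follows. This is a sketch under assumptions, not a definitive implementation: the `PaneReader` class, both method names, and the 4096-character default capacity are hypothetical, and unlike the code in the question this version keeps a line break after each collected line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class PaneReader
{
    // Hypothetical helper: collects every line that sits between tag lines.
    // initialCapacity is only a guess at the final size, in characters.
    public static string CollectBetweenTags(
        IEnumerable<string> lines, string tag, int initialCapacity = 4096)
    {
        // Sizing the buffer up front reduces how often it has to grow.
        var content = new StringBuilder(initialCapacity);
        bool inside = false;

        foreach (string line in lines)
        {
            if (line.Contains(tag))
            {
                // Each tag line opens or closes a section.
                inside = !inside;
            }
            else if (inside)
            {
                // Keep line boundaries so tokens on adjacent lines do not merge.
                content.Append(line).Append('\n');
            }
        }

        return content.ToString();
    }

    // File.ReadLines yields lines lazily, so only the collected section is
    // held in memory, not the whole file.
    public static string CollectFromFile(string path, string tag)
    {
        return CollectBetweenTags(File.ReadLines(path), tag);
    }
}
```

Taking an `IEnumerable<string>` keeps the collecting logic independent of the file system, which makes it easy to exercise with an in-memory array before pointing it at a large file.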
