Correct way to parse a text file

I have a text file that arrives in a predefined format. I don't have anything like an XSD for it, but the pattern is easy to see.

For example, it looks something like this:

[MyFIRSTPARAGRAPH]
NUM1 NUM2 NUM3 NUM4 NUM5 NUM6 NUM7 NUM8 NUM9 NUM10 NUM11
1 1 0.000 0.000 0.000 0 1 1 0 0 ""
2 2 22.800 0.000 0.000 0 1 1 0 0 ""
3 3 45.600 0.000 0.000 0 1 1 0 0 ""
4 4 68.400 0.000 0.000 0 1 1 0 0 ""
5 5 91.200 0.000 0.000 0 1 1 0 0 ""
6 6 0.000 32.800 0.000 0 1 1 0 0 ""
7 7 22.800 32.800 0.000 0 1 1 0 0 ""
8 8 45.600 32.800 0.000 0 1 1 0 0 ""
9 9 68.400 32.800 0.000 0 1 1 0 0 ""
10 10 91.200 32.800 0.000 0 1 1 0 0 "" 

The file contains many such paragraphs, separated by blank lines.

Any suggestions on the best way to parse files like this and extract the values from the text?

My very first guess would be to do something like this:

using (var reader = GetStreamReader())   // GetStreamReader() is a helper not shown here
{
    bool justReadATag = false;
    string line;

    while ((line = reader.ReadLine()) != null)
    {
        // Blank lines only separate paragraphs, so skip them.
        if (string.IsNullOrWhiteSpace(line))
            continue;

        if (IsTag(line))   // IsTag() is a helper not shown here, e.g. matching "[...]"
        {
            // do some work with the paragraph tag
            justReadATag = true;
        }
        else
        {
            string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (justReadATag)
            {
                // the line right after a tag holds the column names
                justReadATag = false;
            }
            else
            {
                // do some work with the cell values
            }
        }
    }
}

I would suggest reading the complete file with the File.ReadAllLines method. You can then iterate over the lines one by one, and for each line use String.Split(' ') to get the space-separated values.
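
As a rough illustration of that suggestion, here is a minimal sketch that reads all lines, skips the blank separators, and groups the split values under each [TAG] line. The Section class, the SectionParser.Parse method, and the rule that a tag is any line wrapped in square brackets are assumptions made for this example; they do not come from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical container for one [TAG] block (an assumption for this sketch).
class Section
{
    public string Name = "";
    public string[] Columns = Array.Empty<string>();
    public List<string[]> Rows = new List<string[]>();
}

static class SectionParser
{
    static readonly char[] Space = { ' ' };

    public static List<Section> Parse(IEnumerable<string> lines)
    {
        var sections = new List<Section>();
        Section current = null;
        bool expectHeader = false;

        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0) continue;      // blank lines only separate sections

            // Assumed tag rule: the whole line is wrapped in square brackets.
            if (line.StartsWith("[") && line.EndsWith("]"))
            {
                current = new Section { Name = line.Substring(1, line.Length - 2) };
                sections.Add(current);
                expectHeader = true;
                continue;
            }

            if (current == null) continue;       // ignore anything before the first tag

            var parts = line.Split(Space, StringSplitOptions.RemoveEmptyEntries);
            if (expectHeader)
            {
                current.Columns = parts;         // first line after a tag: column names
                expectHeader = false;
            }
            else
            {
                current.Rows.Add(parts);         // every other line: cell values
            }
        }
        return sections;
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // With a path argument, read the whole file as suggested above;
        // otherwise fall back to a tiny in-memory sample.
        string[] lines = args.Length > 0
            ? File.ReadAllLines(args[0])
            : new[] { "[MyFIRSTPARAGRAPH]", "NUM1 NUM2 NUM3", "1 1 0.000", "", "2 2 22.800" };

        foreach (var s in SectionParser.Parse(lines))
            Console.WriteLine($"{s.Name}: {s.Columns.Length} columns, {s.Rows.Count} rows");
    }
}
```

Two caveats on this sketch: splitting on spaces works for the sample shown, where the last column is always an empty "", but a quoted value that itself contains spaces would be broken into several parts. Also, when converting cells such as 22.800 to numbers, passing CultureInfo.InvariantCulture to double.Parse avoids surprises on machines whose culture uses a comma as the decimal separator.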
