
Parsing a text file into a DataTable with irregular rows

I am trying to parse tabular data from a text file into a DataTable.

The text file contains:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  11 root        1 171   52     0K    12K RUN     23:46 80.42% idle
  12 root        1 -20 -139     0K    12K RUN AS    0:56  7.96% swi7:

The code I have looks like this:

using System;
using System.Data;
using System.IO;
using System.Linq;

public class Program
{
    static void Main(string[] args)
    {
        var lines = File.ReadLines("bb.txt").ToArray();
        var headerLine = lines[0];
        var dt = new DataTable();
        var columnsArray = headerLine.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        var dataColumns = columnsArray.Select(item => new DataColumn { ColumnName = item });
        dt.Columns.AddRange(dataColumns.ToArray());
        for (int i = 1; i < lines.Length; i++)
        {
            var rowLine = lines[i];
            var rowArray = rowLine.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
            var x = dt.NewRow();
            x.ItemArray = rowArray;
            dt.Rows.Add(x);

        }
    }
}

I get the error "Input array is longer than the number of columns in this table" on the second iteration of the loop, at this line:

x.ItemArray = rowArray;

Of course, this is because the second row has "RUN AS" as the value of the 8th column. That value contains a space, which is also the split character for the entire row, so the array ends up with more elements than the table has columns.

What is a possible solution for this kind of situation?

Assuming "RUN AS" is the only value that causes this, you could run var sanitizedLine = rowLine.Replace("RUN AS", "RUNAS") before the split and then restore the space in that field afterwards. If this can happen with other values, however, you may need to check whether the array produced by the split matches the number of header columns and, if it is longer, combine the offending elements into a new array of the correct length before adding it as a row.
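Both ideas could be sketched roughly as follows. This is only a minimal sketch: RowFixer, SplitKnownState and SplitRow are hypothetical names, and SplitRow assumes that a single column (STATE here) is the only one whose value may contain spaces, so every surplus token belongs to it.

```csharp
using System;
using System.Linq;

public static class RowFixer
{
    // Placeholder approach: only handles the one known value "RUN AS".
    public static string[] SplitKnownState(string line)
    {
        const string placeholder = "RUNAS"; // assumed never to occur as a real value
        return line.Replace("RUN AS", placeholder)
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(t => t == placeholder ? "RUN AS" : t) // restore the space
                   .ToArray();
    }

    // Length-check approach: if the split yields more tokens than columns,
    // join the surplus tokens into the field at mergeIndex (assumption: that
    // column is the only one that can contain spaces).
    public static string[] SplitRow(string line, int columnCount, int mergeIndex)
    {
        var tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int extra = tokens.Length - columnCount;
        if (extra <= 0)
            return tokens; // fits already (or is short; the caller decides what to do)

        var merged = string.Join(" ", tokens.Skip(mergeIndex).Take(extra + 1));
        return tokens.Take(mergeIndex)
                     .Concat(new[] { merged })
                     .Concat(tokens.Skip(mergeIndex + extra + 1))
                     .ToArray();
    }
}
```

In the loop from the question, the assignment would then become something like x.ItemArray = RowFixer.SplitRow(rowLine, dt.Columns.Count, dt.Columns.IndexOf("STATE")). If more than one column can contain spaces, splitting on spaces alone cannot tell which column the extra tokens belong to, so this approach would not be enough.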

Ideally, however, you would have whatever generates your input file wrap strings in quotes to make your life easier.
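If the file did quote such values (for example "RUN AS" written with double quotes around it), a quote-aware split could look something like the sketch below. QuotedSplitter is a hypothetical name, and the pattern assumes quoted values never contain an escaped double quote.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedSplitter
{
    // Matches either a double-quoted run (group 1, quotes excluded)
    // or a run of non-whitespace characters (group 2).
    private static readonly Regex Token = new Regex("\"([^\"]*)\"|(\\S+)");

    public static string[] Split(string line)
    {
        return Token.Matches(line)
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value)
                    .ToArray();
    }
}
```

With that in place, every row yields exactly one token per column, so no length check or merging is needed.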
