C# .NET data mining from text to table

Previously we received data by mail in the following format; we had to extract the important parts from the text (body) of the mail and put them in a table.

 type         date               size

weekly    04/05/2012 16.03.03     388

I am capturing the values like this. First with an array:

string[] orderOfValues = new string[3];
orderOfValues[0] = "TYPE";
orderOfValues[1] = "DATE";
orderOfValues[2] = "SIZE";

and then in a dictionary:

sdValues = new StringDictionary();

Then I extract the fields by splitting and add them:

sdValues.Add("TYPE", field1);
sdValues.Add("DATE", field2);
sdValues.Add("SIZE", field3);
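
For reference, the splitting step for a single row could look roughly like the sketch below. It assumes that columns are separated by two or more spaces while the date and time are separated by exactly one, as in the sample above. The class and method names (`SingleRowParser`, `Parse`) are hypothetical and only for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Text.RegularExpressions;

public static class SingleRowParser
{
    // Hypothetical helper: splits one data row into TYPE, DATE and SIZE.
    // Assumption: columns are separated by runs of 2+ whitespace characters,
    // so the single space inside "04/05/2012 16.03.03" is preserved.
    public static StringDictionary Parse(string row)
    {
        string[] fields = Regex.Split(row.Trim(), @"\s{2,}");
        if (fields.Length != 3)
        {
            throw new FormatException(
                "Expected 3 columns but found " + fields.Length);
        }

        var sdValues = new StringDictionary();
        sdValues.Add("TYPE", fields[0]);
        sdValues.Add("DATE", fields[1]);
        sdValues.Add("SIZE", fields[2]);
        return sdValues;
    }
}
```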

Now the upstream sender has changed the data to contain multiple rows.

 type         date               size

weekly    04/05/2012 16.03.03     388
daily     04/07/2012 17.03.03     14
weekly    04/08/2012 19.03.03     643

Since the number of rows is now dynamic, please advise how to proceed.

OK, so you take that file, break it into fields and then write out the fields separated by a |.

It seems to me you need to go over each input line and do the following:

String[] parts = inputLine.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
String outputLine = String.Join("|", parts);

Write the output lines to your flat file. You don't need a dictionary.
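
Putting this together, a minimal sketch of the whole loop might look like the following. It assumes the mail body is already available as one string, that the header row starts with the word "type", and that the columns are separated by spaces rather than tabs. `MailTableConverter` and `ToPipeDelimited` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class MailTableConverter
{
    // Hypothetical helper: converts the mail body into pipe-delimited lines,
    // one per data row, however many rows the mail contains.
    public static List<string> ToPipeDelimited(string body)
    {
        var outputLines = new List<string>();

        // Break the body into lines, dropping empty ones.
        string[] inputLines = body.Split(
            new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string inputLine in inputLines)
        {
            string[] parts = inputLine.Split(
                new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            // Skip whitespace-only lines and the header row
            // (assumed to start with "type").
            if (parts.Length == 0 ||
                parts[0].Equals("type", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            outputLines.Add(String.Join("|", parts));
        }

        return outputLines;
    }
}
```

Note that splitting on single spaces also separates the date from the time, so the first sample row becomes `weekly|04/05/2012|16.03.03|388`, with four fields instead of three. If the date and time should stay in one field, the second and third parts can be rejoined before calling `String.Join`. The resulting list can then be written out with `File.WriteAllLines` to whatever path the flat file uses.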
