Read Specific Strings from Text File

I'm trying to get certain strings out of a text file and put them in variables. This is what the structure of the text file looks like. Keep in mind that this is just one line; every line looks like this, and the lines are separated by a blank line:

Date: 8/12/2013 12:00:00 AM Source Path: \\build\PM\11.0.64.1\build.11.0.64.1.FileServerOutput.zip Destination Path: C:\Users\Documents\.NET Development\testing\11.0.64.1\build.11.0.55.5.FileServerOutput.zip Folder Updated: 11.0.64.1 File Copied: build.11.0.55.5.FileServerOutput.zip

I wasn't sure what to use as a delimiter for this text file, or even whether I should be using a delimiter at all, so the format is subject to change.

As a quick example of what I want to happen: I want to go through the file, grab the Destination Path, and store it in a variable such as strDestPath.

The code I have come up with so far is this:

//find the variables from the text file
string[] lines = File.ReadAllLines(GlobalVars.strLogPath);

Not much, I know. I thought I could read one line at a time and search each line for what I'm looking for, but honestly I'm not 100% sure whether I should stick with that approach.

If you are unsure how large your file is, you should use ReadLines, which uses deferred execution, instead of ReadAllLines:

var lines = File.ReadLines(GlobalVars.strLogPath);

The ReadLines and ReadAllLines methods differ as follows:

When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings to be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.
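
To tie this back to the question, here is a minimal sketch of streaming the log with ReadLines and stopping at the first Destination Path. It assumes each record sits on a single line and that the labels "Destination Path:" and "Folder Updated:" appear exactly as in the sample line; the names LogScanner, ExtractDestPath and FirstDestPath are hypothetical and only for illustration.

```csharp
using System;
using System.IO;

public static class LogScanner
{
    // Labels taken from the sample line in the question (assumed to be stable).
    const string StartLabel = "Destination Path:";
    const string EndLabel = "Folder Updated:";

    // Returns the text between the two labels, or null if the line
    // contains no "Destination Path:" label.
    public static string ExtractDestPath(string line)
    {
        int start = line.IndexOf(StartLabel, StringComparison.Ordinal);
        if (start < 0) return null;
        start += StartLabel.Length;

        // If the closing label is missing, take everything up to the end of the line.
        int end = line.IndexOf(EndLabel, start, StringComparison.Ordinal);
        if (end < 0) end = line.Length;

        return line.Substring(start, end - start).Trim();
    }

    // Streams the file line by line and returns the first destination path found.
    public static string FirstDestPath(string logPath)
    {
        foreach (string line in File.ReadLines(logPath))
        {
            string path = ExtractDestPath(line);
            if (path != null) return path; // stop early; the rest of the file is never read
        }
        return null;
    }
}
```

With this in place, something like string strDestPath = LogScanner.FirstDestPath(GlobalVars.strLogPath); would fill the variable from the question. Blank separator lines are simply skipped because they contain no label.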

As odd as it might sound, you should take a look at Log Parser. If you are free to choose the file format, you could use one that Log Parser supports and, believe me, it will make your life a lot easier.

Once you load the file with Log Parser, you can use queries to get the information you want. If you don't mind using interop in your project, you can even add a COM reference and use it from any .NET project.

This sample reads a HUGE CSV file and bulk-copies it to the database, where the final steps are performed. This is not really your case, but it shows how easy it is to do this with Log Parser:

COMTSVInputContextClass logParserTsv = new COMTSVInputContextClass();
COMSQLOutputContextClass logParserSql = new COMSQLOutputContextClass();
logParserTsv.separator = ";";
logParserTsv.fixedSep = true;

logParserSql.database = _sqlDatabaseName;
logParserSql.server = _sqlServerName;
logParserSql.username = _sqlUser;
logParserSql.password = _sqlPass;
logParserSql.createTable = false;
logParserSql.ignoreIdCols = true;

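// query shortened for clarity purposes
string SelectPattern = @"Select TO_STRING(UserName), TO_STRING(UserID) INTO {0} From {1}";

string query = string.Format(SelectPattern, _sqlTable, _csvPath);
// logParser is not declared in this excerpt; it is presumably the Log Parser
// query object obtained through the COM reference
logParser.ExecuteBatch(query, logParserTsv, logParserSql);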

Log Parser is one of those hidden gems from Microsoft that most people don't know about. I have used it to read IIS logs, CSV files, text files, etc. You can even generate charts!

Just check it out here: http://support.microsoft.com/kb/910447/en

Looks like you need to create a tokenizer. Try something like this:

Define a list of token values:

List<string> gTkList = new List<string>() {"Date:","Source Path:" }; //...etc.

Create a Token class:

public class Token
{
  private readonly string _tokenText;
  private string _val;
  private int _begin, _end;

  public Token(string tk, int beg, int end)
  {
   this._tokenText = tk;
   this._begin = beg;
   this._end = end;
   this._val = String.Empty;
  }

  public string TokenText
  {
   get{ return _tokenText; }
  }

  public string Value
  {
   get { return _val; }
   set { _val = value; }
  }

  public int IdxBegin
  {
   get { return _begin; }
  }

  public int IdxEnd
  {
   get { return _end; }
  }
}

Create a method to find your tokens:

List<Token> FindTokens(string str)
{
 List<Token> retVal = new List<Token>();
 if (!String.IsNullOrWhiteSpace(str))
 {
  foreach(string cd in gTkList)
  {
    int fIdx = str.IndexOf(cd);
    if(fIdx > -1)
       retVal.Add(new Token(cd, fIdx, fIdx + cd.Length));
  }
  //order the tokens by where they appear in the line
  retVal.Sort((a, b) => a.IdxBegin.CompareTo(b.IdxBegin));
 }
 return retVal;
}

Then just do something like this:

foreach(string ln in lines)
{
 //returns list of tokens (expected in order of appearance in the line)
 var tkns = FindTokens(ln);
 for(int i=0; i < tkns.Count; i++)
 {
  //the value runs from the end of this token to the start of the next one (or to the end of the line)
  int len = (i == tkns.Count - 1) ? ln.Length - tkns[i].IdxEnd : tkns[i+1].IdxBegin - tkns[i].IdxEnd;
  tkns[i].Value = ln.Substring(tkns[i].IdxEnd, len).Trim();
 }

 //Do something with the gathered values
 foreach(Token tk in tkns)
 {
  //stuff
 }
}
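
For the original goal of getting the Destination Path into strDestPath, the same label-splitting idea can be condensed into a dictionary lookup. The following is only a sketch, assuming the five labels from the sample line in the question; RecordParser, Labels and Parse are hypothetical names and not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RecordParser
{
    // Labels as they appear in the sample line (assumed to be the full set).
    static readonly string[] Labels =
    {
        "Date:", "Source Path:", "Destination Path:", "Folder Updated:", "File Copied:"
    };

    // Splits one record line into label -> value pairs (label stored without the colon).
    public static Dictionary<string, string> Parse(string line)
    {
        // Locate every label that is present and order them by position in the line.
        var found = Labels
            .Select(label => new { Label = label, Index = line.IndexOf(label, StringComparison.Ordinal) })
            .Where(x => x.Index >= 0)
            .OrderBy(x => x.Index)
            .ToList();

        var result = new Dictionary<string, string>();
        for (int i = 0; i < found.Count; i++)
        {
            // A value runs from the end of its label to the start of the next label,
            // or to the end of the line for the last label.
            int valueStart = found[i].Index + found[i].Label.Length;
            int valueEnd = (i == found.Count - 1) ? line.Length : found[i + 1].Index;
            string key = found[i].Label.TrimEnd(':');
            result[key] = line.Substring(valueStart, valueEnd - valueStart).Trim();
        }
        return result;
    }
}
```

Inside the loop over lines, calling RecordParser.Parse(ln) and then TryGetValue("Destination Path", out strDestPath) on the result would give the path when the line has one; blank separator lines produce an empty dictionary.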
