
Parsing a text file in Java

Example from input file:

ARTIST="unknown"
TITLE="Rockabye Baby"
LYRICS="Rockabye baby in the treetops
When the wind blows your cradle will rock
When the bow breaks your cradle will fall
Down will come baby cradle and all
"

The Artist, Title and Lyrics fields have to be extracted to their respective Strings with capitalization and format unchanged. This code for the Artist field

while (readIn.hasNext()) {
    readToken= readIn.next();
    if (readToken.contains("ARTIST")) {
        artist= readIn.next();
        }
        if (readToken.contains("TITLE")){
            title= readIn.next();
    System.out.print(artist+" "+title);
}

ends up printing this out:

"unknown"
TITLE null"unknown"
TITLE null"unknown"
TITLE

From the code, I can't see why the output is printing this way. Every time through the loop, the readToken string gets refreshed, and should then be compared by the contains() method. Obviously I'm missing something here.

So, am I close to the right track or am I in a completely different city?

From your code of

while (readIn.hasNext()) {
    readToken= readIn.next();
    if (readToken.contains("ARTIST")) {
        artist= readIn.next();
        }
    if (readToken.contains("TITLE")){
        title= readIn.next();
    System.out.print(artist+" "+title);
}

Assuming the variables artist, readToken and title are declared and initialized correctly, the program first checks in the while condition whether there is more input. If there is, readToken (a String, I'm assuming) is set to the next token. Note that if readIn is a Scanner, next() returns the next token, which is not necessarily a whole line. If readToken contains "ARTIST", then the NEXT token after it is saved as the artist string, and likewise for "TITLE". Because every call to next() consumes input, the token that holds TITLE may already have been consumed as the artist value. By the time the while loop repeats, you might already have hit LYRICS, skipping TITLE altogether and leaving title null.
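To make the skipping visible, here is a minimal sketch that runs the same loop on the first two lines of the sample input. It assumes readIn is a java.util.Scanner with the default whitespace delimiter (the question does not show how it was created), and the class name TokenSkipDemo, the method runOriginalLoop and the added hasNext() guards are only for illustration.

```java
import java.util.Scanner;

public class TokenSkipDemo {
    // Runs the loop from the question and returns {artist, title} as it assigns them.
    public static String[] runOriginalLoop(String input) {
        String artist = null;
        String title = null;
        Scanner readIn = new Scanner(input); // assumption: default whitespace delimiter
        while (readIn.hasNext()) {
            String readToken = readIn.next();
            // hasNext() guards are an addition so the demo cannot run past the end
            if (readToken.contains("ARTIST") && readIn.hasNext()) {
                artist = readIn.next(); // consumes the token that holds TITLE
            }
            if (readToken.contains("TITLE") && readIn.hasNext()) {
                title = readIn.next();  // never reached for this input
            }
        }
        readIn.close();
        return new String[] { artist, title };
    }

    public static void main(String[] args) {
        String input = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n";
        String[] result = runOriginalLoop(input);
        System.out.println("artist = " + result[0]); // the TITLE token ends up here
        System.out.println("title  = " + result[1]); // stays null
    }
}
```

With this input the artist variable receives the token that starts with TITLE, and title is never assigned, which matches the kind of symptom described in the question.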

What you want is maybe saving

artist = readToken; or title = readToken; instead.

Also, don't forget to take a substring of artist and title if you want to print "ARTISTNAME, TITLENAME" rather than "ARTIST="ARTISTNAMEHERE" TITLE="TITLENAMEHERE"".
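One way to put both suggestions together (use the current line rather than the next one, then cut the value out with substring) is sketched below. It reads whole lines instead of tokens so that spaces and line breaks in the values survive. The class name SongFieldReader and the method readFields are hypothetical, and the sketch assumes every field starts a line as KEY=" and that a double quote at the end of a line always closes the field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

public class SongFieldReader {
    // Reads KEY="value" fields; a value may span several lines until the closing quote.
    public static Map<String, String> readFields(String input) {
        Map<String, String> fields = new LinkedHashMap<>();
        Scanner readIn = new Scanner(input);
        String key = null;                        // field being read, or null between fields
        StringBuilder value = new StringBuilder();
        while (readIn.hasNextLine()) {
            String line = readIn.nextLine();
            if (key == null) {
                int eq = line.indexOf("=\"");
                if (eq < 0) {
                    continue;                     // not the start of a field
                }
                key = line.substring(0, eq);      // e.g. ARTIST
                line = line.substring(eq + 2);    // drop KEY=" and keep the rest
            } else {
                value.append('\n');               // keep the original line breaks
            }
            if (line.endsWith("\"")) {            // closing quote ends the field
                value.append(line, 0, line.length() - 1);
                fields.put(key, value.toString());
                key = null;
                value.setLength(0);
            } else {
                value.append(line);
            }
        }
        readIn.close();
        return fields;
    }

    public static void main(String[] args) {
        String input = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n"
                + "LYRICS=\"Rockabye baby in the treetops\nDown will come baby cradle and all\n\"\n";
        Map<String, String> fields = readFields(input);
        System.out.println(fields.get("ARTIST") + ", " + fields.get("TITLE"));
        System.out.print(fields.get("LYRICS"));
    }
}
```

To read from a file instead of a String, the Scanner could be constructed from a java.io.File; the rest of the loop stays the same.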

Further to Alex Hart's answer, I think you might consider using Java's Pattern and Matcher classes and using groups to obtain the matched arguments, e.g. (untested):

// Needs java.util.regex.Pattern; Pattern has no public constructor, so use Pattern.compile
private static final Pattern RECORDING_HEADER =
  Pattern.compile("(ARTIST=\"(.*)\")?(TITLE=\"(.*)\")?(LYRICS=\"(.*)\")?");

Then when you're reading a line:

String line = readIn.readLine(); // Presuming that readIn is a BufferedReader
Matcher m = RECORDING_HEADER.matcher(line);

if (m.matches()) {
  final int artistGroup = 2;
  String artist = m.group(artistGroup);

  final int titleGroup = 4;
  String title = m.group(titleGroup);

  final int lyricsGroup = 6;
  String lyrics = m.group(lyricsGroup);

  if (artist != null) {
    // You've got an artist...
  } else if (title != null) {
    // etc...
  }
}
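Because the LYRICS value spans several lines, matching one line at a time cannot capture it in a single group. A variation on the same Pattern and Matcher idea is to match against the whole file contents at once. The sketch below assumes the text has already been read into a String (for example with Files.readString on Java 11 or later) and that values never contain a double quote. The class name SongRegexDemo and the method field are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SongRegexDemo {
    // [^"]* also matches line breaks, so a multi-line LYRICS value is captured whole.
    private static final Pattern FIELD =
            Pattern.compile("(ARTIST|TITLE|LYRICS)=\"([^\"]*)\"");

    // Returns the value of the named field, or null if the field is absent.
    public static String field(String text, String name) {
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return m.group(2); // text between the quotes, unchanged
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n"
                + "LYRICS=\"Rockabye baby in the treetops\nDown will come baby cradle and all\n\"\n";
        System.out.println(field(text, "ARTIST") + ", " + field(text, "TITLE"));
        System.out.print(field(text, "LYRICS"));
    }
}
```

Using find() instead of matches() lets the fields appear in any order and tolerates other text between them.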

It looks as if you are reading the input line by line, but when you find what you are looking for, you set your variable to the next line instead of the current one. This may be causing problems, including reading past the end of the input, and could very well be the cause of your mishap.

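Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you reproduce this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.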

 