
Does anyone know how to parse a text file like this?

I am developing an Android app in Java.

I have a text file like this, hosted on the internet:

http://www2.compute.com.tw/~yflin/music.txt

and its contents look like this:

Name="testMusic1" Uri="http://www2.compute.com.tw/test1.mp3" Author="chali" Length=125 Icon="xxx" Lang="台语"
Name="testMusic2" Uri="http://www2.compute.com.tw/test2.mp3" Author="yflin" Length=123 Icon="xxx" Lang="国语"
Name="testMusic3" Uri="http://www2.compute.com.tw/test3.mp3" Author="kkj" Length=132 Icon="xxx" Lang="英文"

I know how to parse a .csv file, and I know how to parse an XML file using XPath expressions, but is there an easier way to parse a text file like this? Is there a good API I can use for it, or should I use the Java Scanner with useDelimiter?

Are there any examples written in Java? I have been searching for a long time and cannot find anything more. Can someone help me?

Here is a complete, tested, working program. Put this file in the same directory as your music.txt and run it. It uses the Scanner class.

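import java.io.*;
import java.util.*;

public class Program {

    public static void main(String[] args) throws Exception {
        // try-with-resources closes the Scanner (and the underlying FileReader)
        try (Scanner scanner = new Scanner(new FileReader("music.txt"))) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines instead of failing on them
                }
                // Each whitespace-separated token is one key=value pair
                String[] tokens = line.split("\\s+");
                for (int i = 0; i < tokens.length; i++) {
                    // Limit 2 keeps any '=' inside the value intact
                    String[] elements = tokens[i].split("=", 2);
                    if (elements.length == 2) {
                        System.out.println(elements[0] + ": " + elements[1]);
                    }
                }
                System.out.println();
            }
        }
    }
}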

Sample Output

Name: "testMusic1"
Uri: "http://www2.compute.com.tw/test1.mp3"
Author: "chali"
Length: 125
Icon: "xxx"
Lang: "test1"

Name: "testMusic2"
Uri: "http://www2.compute.com.tw/test2.mp3"
Author: "yflin"
Length: 123
Icon: "xxx"
Lang: "test2"

Name: "testMusic3"
Uri: "http://www2.compute.com.tw/test3.mp3"
Author: "kkj"
Length: 132
Icon: "xxx"
Lang: "test3"


One possibility is String.split().

EXAMPLE:

public class Split {

  static final String s = 
    "Name=\"testMusic1\"   Uri=\"http://www2.compute.com.tw/test1.mp3\"   Author=\"chali\"";

  public static void main (String[] args) {
    String[] nameValuePairs = s.split("\\s+"); // Split each item in string on whitespace (\s+)
    for (int i=0; i < nameValuePairs.length; i++) {
      String[] nv = nameValuePairs[i].split("=");  // Split the resulting name/value pair on "="
      System.out.println ("pair[" + i + "]: (" + nameValuePairs[i] + "), name=" + nv[0] + ", value=" + nv[1] + ".");
    }
  }

}

SAMPLE OUTPUT:

pair[0]: (Name="testMusic1"), name=Name, value="testMusic1".
pair[1]: (Uri="http://www2.compute.com.tw/test1.mp3"), name=Uri, value="http://www2.compute.com.tw/test1.mp3".
pair[2]: (Author="chali"), name=Author, value="chali".
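Building on the split approach above, here is a minimal sketch that collects the pairs of one line into a map and strips the surrounding quotes. The class name PairMap and the helper parseLine are hypothetical names chosen for illustration, and the sketch assumes that keys and values contain no whitespace, as in the sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PairMap {

    // Hypothetical helper: parses one line of Key="value" pairs into an ordered map.
    // Assumes no whitespace inside keys or values, as in the sample data.
    static Map<String, String> parseLine(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        for (String pair : trimmed.split("\\s+")) {
            String[] nv = pair.split("=", 2); // limit 2: a value may itself contain '='
            if (nv.length < 2) {
                continue; // skip tokens that are not key=value
            }
            String value = nv[1];
            // Strip one pair of surrounding double quotes, if present
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            result.put(nv[0], value);
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "Name=\"testMusic1\" Uri=\"http://www2.compute.com.tw/test1.mp3\" "
                + "Author=\"chali\" Length=125";
        Map<String, String> song = parseLine(line);
        System.out.println(song.get("Name"));
        System.out.println(Integer.parseInt(song.get("Length")));
    }
}
```

Numeric fields such as Length arrive as strings, so the caller converts them (for example with Integer.parseInt) when needed.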

The problem here is that you have not provided a proper syntax for your input, and designing a parser based on examples is ... dodgy.

For example, if we assume that all inputs will look like those examples, then String.split("\\s+") followed by String.split("=") will do the job. Or you could do the same thing using Scanner.
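As a rough illustration of the Scanner variant, the sketch below tokenizes one line with useDelimiter("\\s+") and then splits each token on the first "=". The class name ScannerPairs and the helper scanLine are hypothetical, and the sketch makes the same assumption as the split approach: no whitespace inside keys or values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerPairs {

    // Hypothetical helper: returns the key=value tokens of one line as {key, value} arrays.
    static List<String[]> scanLine(String line) {
        List<String[]> pairs = new ArrayList<>();
        // try-with-resources closes the Scanner when done
        try (Scanner tokens = new Scanner(line).useDelimiter("\\s+")) {
            while (tokens.hasNext()) {
                // Limit 2 keeps any '=' inside the value intact
                String[] nv = tokens.next().split("=", 2);
                if (nv.length == 2) {
                    pairs.add(nv);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "Name=\"testMusic2\" Author=\"yflin\" Length=123";
        for (String[] nv : scanLine(line)) {
            System.out.println(nv[0] + ": " + nv[1]);
        }
    }
}
```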

But if other inputs (that you haven't seen yet) are a bit different, this may not work. For instance:

  • the key/value pairs might be in a different order,
  • the key names might have different casing,
  • the key names might contain spaces,
  • the values might contain spaces,
  • there might be some escaping convention in the quoted strings to deal with embedded quote characters, et cetera,
  • there might be no space between the end-quote of a value and the next key name.

In short, it is doubtful that there is sufficient information in those examples for you (or anyone) to implement a parser that will work for all inputs.
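If you are willing to commit to a grammar yourself, a regular expression can tolerate several of the variations listed above. The sketch below is only that: it assumes a grammar the question never confirms, namely that keys are single words, that values are either bare tokens or double-quoted strings, and that a backslash escapes the next character inside quotes. The class name TolerantParser and the helper parseLine are hypothetical. It handles pairs in any order, mixed key casing, spaces inside quoted values, and a missing space before the next key; it does not handle key names that contain spaces.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TolerantParser {

    // Assumed grammar (not confirmed by the source):
    //   key = "quoted value"   or   key = bareValue
    // A quoted value may contain spaces and backslash-escaped characters.
    private static final Pattern PAIR = Pattern.compile(
            "(\\w+)\\s*=\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^\\s\"]+))");

    static Map<String, String> parseLine(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // Normalize key casing so "NAME" and "Name" map to the same entry
            String key = m.group(1).toLowerCase(Locale.ROOT);
            String value;
            if (m.group(2) != null) {
                // Quoted value: drop the backslash from each escape sequence
                value = m.group(2).replaceAll("\\\\(.)", "$1");
            } else {
                value = m.group(3); // bare value such as 125
            }
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "NAME=\"My Song\"Author=\"a \\\"b\\\" c\" Length=125";
        for (Map.Entry<String, String> e : parseLine(line).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Text that does not match the pattern is silently skipped, so a stricter version would check that the matches cover the whole line and report anything left over.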
