
How do I parse a string to get specific information using Java?

Here are some lines from a file, and I'm not sure how to parse them to extract four pieces of information.

11::American President, The (1995)::Comedy|Drama|Romance
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
13::Balto (1995)::Animation|Children's
14::Nixon (1995)::Drama

I would like to get the number, title, release date, and genre. A line can list multiple genres, so I would like to save each one in a variable as well.

I'm using the .split("::|\\\\|") method to parse it, but I'm not able to extract the release date.

Can anyone help me?

The easiest approach is to match each line with a regex, something like this:

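  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  String x = "11::Title (2016)::Category";
  // Groups: 1 = number, 2 = title, 3 = four-digit year, 4 = categories
  Pattern p = Pattern.compile("^([0-9]+)::([a-zA-Z ]+)\\(([0-9]{4})\\)::([a-zA-Z]+)$");
  Matcher m = p.matcher(x);
  if (m.find()) {
    // trim() drops the space that group 2 captures before the "("
    System.out.println("Number: " + m.group(1) + " Title: " + m.group(2).trim() + " Year: " + m.group(3) + " Categories: " + m.group(4));
  }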

(Please don't hold me to the exact syntax; this is off the top of my head.)

The first capture group is then the number, the second is the title, the third is the year, and the fourth is the set of categories, which you can then split on '|'.

You may need to adjust the valid characters for the title and categories, but you should get the idea.
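As one possible adjustment, the sketch below widens the title and category groups so that the sample lines from the question match, then splits the fourth group on '|'. The class name MovieLineRegex, the helper parse, and the widened pattern are illustrative assumptions, not part of the original answer.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MovieLineRegex {
    // Assumed layout: number::title (year)::genres
    // The lazy (.+?) lets titles contain commas, colons, and apostrophes.
    private static final Pattern LINE =
        Pattern.compile("^(\\d+)::(.+?)\\s*\\((\\d{4})\\)::(.+)$");

    /** Returns {number, title, year, genres}, or null when the line does not match. */
    public static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields = parse("12::Dracula: Dead and Loving It (1995)::Comedy|Horror");
        // '|' is a regex metacharacter, so it must be escaped for split()
        String[] genres = fields[3].split("\\|");
        System.out.println(fields[0] + " / " + fields[1] + " / " + fields[2]
            + " / " + Arrays.toString(genres));
    }
}
```

Returning null for a non-matching line is only one option; throwing an exception or returning an Optional would work just as well.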

If you have multiple lines, split them into an ArrayList first and process each one separately in a loop.
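A minimal sketch of that loop follows, assuming the file content is already in a single string. The class MovieFileLoop, the method summarize, and the tolerant pattern are hypothetical names and choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MovieFileLoop {
    // Assumed layout: number::title (year)::genres
    private static final Pattern LINE =
        Pattern.compile("^(\\d+)::(.+?)\\s*\\((\\d{4})\\)::(.+)$");

    /** Splits the text into lines and returns "title (year)" for each line that matches. */
    public static List<String> summarize(String text) {
        // \R matches any line break sequence (Java 8 and later)
        List<String> lines = new ArrayList<>(Arrays.asList(text.split("\\R")));
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // skip blank or malformed lines
            }
            result.add(m.group(2) + " (" + m.group(3) + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "13::Balto (1995)::Animation|Children's\n14::Nixon (1995)::Drama\n";
        summarize(text).forEach(System.out::println);
    }
}
```

When the data really lives in a file, java.nio.file.Files.readAllLines returns a List<String> directly, so the manual split step may not be needed.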

Try this:

import java.util.Arrays;

String[] s = {
    "11::American President, The (1995)::Comedy|Drama|Romance",
    "12::Dracula: Dead and Loving It (1995)::Comedy|Horror",
    "13::Balto (1995)::Animation|Children's",
    "14::Nixon (1995)::Drama",
};
for (String e : s) {
    // Split on "::", on optional whitespace followed by "(", and on ")::"
    String[] infos = e.split("::|\\s*\\(|\\)::");
    String number = infos[0];
    String title = infos[1];
    String releaseDate = infos[2];
    // '|' is a regex metacharacter, so escape it to split the genres
    String[] genres = infos[3].split("\\|");
    System.out.printf("number=%s title=%s releaseDate=%s genres=%s%n",
          number, title, releaseDate, Arrays.toString(genres));
}

Output:

number=11 title=American President, The releaseDate=1995 genres=[Comedy, Drama, Romance]
number=12 title=Dracula: Dead and Loving It releaseDate=1995 genres=[Comedy, Horror]
number=13 title=Balto releaseDate=1995 genres=[Animation, Children's]
number=14 title=Nixon releaseDate=1995 genres=[Drama]
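One caveat with splitting on "(": a title that itself contains parentheses would produce extra array elements and shift the indexes. The sketch below is a regex-free alternative that locates the separators from both ends instead. The class MovieLineIndexOf, the method parse, and the sample line with a parenthesized alternate title are hypothetical, and the code assumes every line is well formed.

```java
import java.util.Arrays;

class MovieLineIndexOf {
    /**
     * Returns {number, title, year, genres}.
     * Assumes the line has two "::" separators and that the year is the
     * last parenthesized group before the second separator.
     */
    public static String[] parse(String line) {
        int firstSep = line.indexOf("::");
        int lastSep = line.lastIndexOf("::");
        String number = line.substring(0, firstSep);
        String middle = line.substring(firstSep + 2, lastSep); // e.g. "Balto (1995)"
        String genres = line.substring(lastSep + 2);

        int open = middle.lastIndexOf('(');
        int close = middle.lastIndexOf(')');
        String title = middle.substring(0, open).trim();
        String year = middle.substring(open + 1, close);
        return new String[] { number, title, year, genres };
    }

    public static void main(String[] args) {
        // Hypothetical input whose title contains its own parentheses
        String[] f = parse("15::Some Title (Alternate Title) (1995)::Drama|Sci-Fi");
        System.out.println(Arrays.toString(f));
        System.out.println(Arrays.toString(f[3].split("\\|")));
    }
}
```

Malformed lines would cause substring to throw here, so real input probably deserves a validity check first.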


 