Extracting a desired substring from a Wolfram Alpha result in Java

Question

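I am writing a Java program that takes a question from the user, sends it to the Wolfram Alpha API, then cleans up the result and prints it.

If the user asks "Who is the president of the United States?", the result looks like this: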

Response: <section><title>Input interpretation</title>    <sectioncontents>United States | President</sectioncontents></section><section><title>Result</title><sectioncontents>Barack Obama  (from 20/01/2009  to  present)</sectioncontents></section><section><title>Basic information</title><sectioncontents>official position | President (44th)..........etc

I want to extract "Barack Obama  (from 20/01/2009  to  present)".

I have already been able to trim everything before "Barack" with the following code:

String clean = response.substring(response.indexOf("Result") + 31, response.length());
System.out.println("Response: " + clean);

How do I trim off the rest of the result?

Answer 1

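The response is essentially XML.

As has been discussed endlessly on many programming forums, regular expressions are not suitable for parsing XML; you should use an XML parser.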
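A minimal sketch of the parser approach, using the DOM parser that ships with the JDK. It rests on assumptions not confirmed by the question: that the complete response is a sequence of well-formed `<section>` elements (the sample in the question is truncated, so it would not parse as shown), and that the `Response: ` prefix can be dropped by skipping to the first `<`. The class name `ResultExtractor`, the method `extractSection`, and the synthetic `<root>` wrapper are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ResultExtractor {

    // Returns the trimmed text of <sectioncontents> in the first <section>
    // whose <title> equals wantedTitle, or null if there is no such section.
    static String extractSection(String response, String wantedTitle) {
        // Assumption: everything before the first '<' is a prefix such as "Response: ".
        int start = response.indexOf('<');
        if (start < 0) {
            return null;
        }
        // The sections have no common parent, so wrap them in a synthetic root
        // element to obtain a single well-formed document.
        String xml = "<root>" + response.substring(start) + "</root>";

        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }

        NodeList sections = doc.getElementsByTagName("section");
        for (int i = 0; i < sections.getLength(); i++) {
            Element section = (Element) sections.item(i);
            NodeList titles = section.getElementsByTagName("title");
            NodeList contents = section.getElementsByTagName("sectioncontents");
            if (titles.getLength() > 0 && contents.getLength() > 0
                    && wantedTitle.equals(titles.item(0).getTextContent().trim())) {
                return contents.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The first two sections from the question, closed properly.
        String response = "Response: <section><title>Input interpretation</title>"
                + "    <sectioncontents>United States | President</sectioncontents></section>"
                + "<section><title>Result</title>"
                + "<sectioncontents>Barack Obama  (from 20/01/2009  to  present)</sectioncontents></section>";
        System.out.println(extractSection(response, "Result"));
    }
}
```

Looking sections up by title rather than by character offset means the extraction does not depend on the exact length of the tags, unlike the `+ 31` offset in the question.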

Answer 2

Well, in case it helps, I came up with this regular expression:

Result.+?>([^<]+?)<

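After finding "Result", it captures the text between the first > and < that have at least one character between them.

Update: here is some sample code that may be useful: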

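import java.util.regex.Matcher;
import java.util.regex.Pattern;

String response = "Response: <section><title>...";  // truncated here
// Capture the first non-empty text between '>' and '<' that follows "Result".
Pattern pattern = Pattern.compile("Result.+?>([^<]+?)<");
Matcher match = pattern.matcher(response);
String clean = "";
if (match.find()) {
    clean = match.group(1);
}
System.out.println(clean);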

2 answers

Answer 1: 0, 2016-11-24 22:56:25

Answer 2: 0, accepted, 2016-11-25 00:32:12
