Jsoup Java: scraping the quote symbol

Question

I understand that the following code scrapes the heading "Google Inc (GOOG)" from http://finance.yahoo.com/q?s=goog :

    String name = doc.select(".title h2").first().text();

I would like to know how to scrape the company name and the ticker symbol separately, as "Google Inc" and "GOOG":

[Screenshot: Yahoo Finance ticker symbol]

Answer 1

(1) The "I must scrape" solution:

This is a short answer that leaves out exception handling, but it is brief and works as-is.

    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class QuoteScraper {
        public static void main(String[] args) throws IOException {
            // Fetch the HTML and parse it into a Document
            String url = "http://finance.yahoo.com/q?s=goog";
            Document doc = Jsoup.connect(url).get();

            // Locate the quote summary header, then its title div, then the h2 tag
            Element header = doc.select("div[id=yfi_rt_quote_summary]").get(0);
            Element title = header.select("div[class=title]").get(0);
            String h2 = title.select("h2").get(0).text();

            // Split on the opening parenthesis (escaped, because split() takes a regex)
            // and strip the closing parenthesis
            // TODO: use a regular expression to handle headings with more than one "(...)" pair
            String[] parts = h2.split("\\(");
            String name = parts[0].trim();
            String shortname = parts[1].replace(")", "").trim();
            System.out.println(name);
            System.out.println(shortname);
        }
    }

The output looks like this:

    Google Inc.
    GOOG
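The TODO in the code above concerns headings that contain more than one pair of parentheses. One possible way to handle that is sketched below, assuming the ticker symbol is always the last "(...)" group in the heading. The class name `HeadingSplitter`, the method `split`, and the "Example Corp" heading are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeadingSplitter {
    // The greedy "(.*)" makes the match settle on the LAST "(...)" group in the heading.
    private static final Pattern HEADING =
            Pattern.compile("^(.*)\\(([^()]*)\\)\\s*$");

    /** Returns {name, symbol}; symbol is empty when the heading has no trailing "(...)". */
    static String[] split(String heading) {
        Matcher m = HEADING.matcher(heading);
        if (!m.matches()) {
            return new String[] { heading.trim(), "" };
        }
        return new String[] { m.group(1).trim(), m.group(2).trim() };
    }

    public static void main(String[] args) {
        String[] simple = split("Google Inc. (GOOG)");
        System.out.println(simple[0] + " | " + simple[1]);   // Google Inc. | GOOG

        // Hypothetical heading with an extra pair of parentheses in the name
        String[] nested = split("Example Corp (Holdings) (EXMP)");
        System.out.println(nested[0] + " | " + nested[1]);   // Example Corp (Holdings) | EXMP
    }
}
```

Unlike `split("\\(")`, this version does not throw an `ArrayIndexOutOfBoundsException` when the heading has no parentheses at all; it returns the whole heading as the name and an empty symbol.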

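(2) The "I don't have to scrape" solution:

Here is a very good article that shows how to download Yahoo data programmatically.

I am also an R user, and getting Yahoo Finance data in R is very easy. You can do your analysis there and save the results to a file or a database as needed. :)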

Answer 2

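You want to scrape the elements with the IDs "yfs_184_goog", "yfs_c63_goog", and "yfs_p43_goog".

These are the large black number, the small red/green number next to it, and the percentage.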
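As a rough illustration of looking elements up by ID with Jsoup, the sketch below parses a small hand-written HTML snippet rather than the live page. The markup and the numbers in it are placeholders invented for this example, as are the class name `QuoteFieldsById` and the helper `textById`; only the three IDs come from the answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class QuoteFieldsById {
    /** Returns the text of the element with the given id, or null if no such element exists. */
    static String textById(Document doc, String id) {
        Element el = doc.getElementById(id);
        return el == null ? null : el.text();
    }

    public static void main(String[] args) {
        // Placeholder markup standing in for the real page; the values are made up
        String html = "<div>"
                + "<span id=\"yfs_184_goog\">100.00</span>"
                + "<span id=\"yfs_c63_goog\">1.50</span>"
                + "<span id=\"yfs_p43_goog\">(1.52%)</span>"
                + "</div>";
        Document doc = Jsoup.parse(html);

        System.out.println("price:   " + textById(doc, "yfs_184_goog"));
        System.out.println("change:  " + textById(doc, "yfs_c63_goog"));
        System.out.println("percent: " + textById(doc, "yfs_p43_goog"));
    }
}
```

Against the real site, `doc` would presumably come from `Jsoup.connect(url).get()` as in Answer 1, and the null check matters because the IDs can disappear whenever the page layout changes.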

Jsoup "screen scraping" of elements with an ID

Solution 1
2, accepted, 2013-11-28 15:34:34

Solution 2
1, 2013-11-28 10:33:28