
How to get the replaced/redirected URL in Java code

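While parsing a web page I get the link href=http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349. When I open this link in a browser, it is replaced with "http://www.onvista.de/aktien/Adidas-Aktie-DE000A1EWWW0" and renders correctly. From Java, however, I cannot retrieve the page. I used the example below, which was suggested here as a way to show the redirected URL.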

import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

public class GetRedirected {

    public GetRedirected() throws MalformedURLException, IOException {
        String url="http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349";
        URLConnection con = new URL( url ).openConnection();
        System.out.println( "orignal url: " + con.getURL() );
        con.connect();
        System.out.println( "connected url: " + con.getURL() );
        InputStream is = con.getInputStream();
        System.out.println( "redirected url: " + con.getURL() );
        is.close();
    }

    public static void main(String[] args) throws Exception {
        new GetRedirected();
    }
}

However, it fails at the "InputStream is = ..." statement with the error message below. How can I fix this? Any ideas are welcome.

orignal url: www.onvista.de/aktien/snapshot.html?ID_OSI=36714349
connected url: www.onvista.de/aktien/snapshot.html?ID_OSI=36714349
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: www.onvista.de/aktien/snapshot.html?ID_OSI=36714349
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at de.gombers.broker....

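Common mistake: when the HTTP status code of an HttpURLConnection response indicates an error (AFAIK >= 400), calling getInputStream() throws an exception. You have to check getResponseCode() first and then decide whether to call getInputStream() or getErrorStream(). So call getResponseCode() before anything else, not getInputStream().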
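
As a minimal sketch of that check, the code below reads the response code first and only then picks the stream. The class name ResponseReader, the helper names, and the UTF-8 decoding are my own assumptions, not something from the question, and readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class ResponseReader {

    // Status codes from 400 upwards are the ones for which getInputStream() throws.
    static boolean isErrorStatus(int status) {
        return status >= 400;
    }

    // Reads the body from whichever stream matches the response code.
    static String readBody(HttpURLConnection conn) throws IOException {
        int status = conn.getResponseCode();
        InputStream raw = isErrorStatus(status)
                ? conn.getErrorStream()
                : conn.getInputStream();
        if (raw == null) {
            return ""; // getErrorStream() returns null when the server sent no body
        }
        try (InputStream in = raw) {
            // Assumption: the body is UTF-8; a real client would honour Content-Type.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI
                .create("http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349")
                .toURL().openConnection();
        System.out.println("status: " + conn.getResponseCode());
        System.out.println(readBody(conn));
    }
}
```

With this shape a 403 no longer ends in an exception; you get the status and whatever error page the server returned, which often explains why the request was refused.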

But I actually cannot reproduce your error; it works for me (although I used a small abstraction library called DavidWebb):

public void testAktienAdidas() throws Exception {

    Webb webb = Webb.create();
    Response<String> response = webb
            .get("http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349")
            .asString();

    assertEquals(200, response.getStatusCode());
    assertNotNull(response.getBody());
    assertTrue(response.getBody().contains("<!DOCTYPE html>"));
}

I did not get a redirect. It may be done on the client side with JavaScript, or there may be some server-side logic that evaluates HTTP headers such as User-Agent.
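
If the server does filter on request headers, one thing to try is sending a browser-like User-Agent, because by default HttpURLConnection identifies itself as "Java/<version>". Whether onvista.de really checks this header is only a guess; the class name BrowserLikeRequest and the header value are illustrative assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class BrowserLikeRequest {

    // Illustrative value only; any browser-like string could be tried.
    static final String USER_AGENT = "Mozilla/5.0";

    // Opens (but does not yet connect) a connection with the header already set.
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", USER_AGENT);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn =
                open("http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349");
        System.out.println("status: " + conn.getResponseCode());
    }
}
```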

If you do run into redirects, however, you can tell HttpURLConnection to follow them automatically:

conn.setInstanceFollowRedirects(true);
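
Note that following redirects is already the default for HttpURLConnection, and the JDK implementation does not follow a redirect that switches protocol (for example from http to https). To see where a redirect points, one option is to turn automatic following off and read the Location header yourself. A minimal sketch; the class name RedirectProbe and its helpers are hypothetical:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RedirectProbe {

    // The common redirect status codes.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Returns the raw Location header of a redirect response, or null otherwise.
    static String redirectTarget(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(false); // keep the 3xx response visible
        try {
            return isRedirect(conn.getResponseCode())
                    ? conn.getHeaderField("Location")
                    : null;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(redirectTarget(
                "http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349"));
    }
}
```

If this prints null, the server answered without a redirect for that request, which would be consistent with the replacement happening in the browser.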
You can retrieve it with this code:
package Test;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpRedirectExample {

  public static void main(String[] args) {

    try {

    String url = "http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349";
    // String urlTest = "https://api.twitter.com/oauth/authenticate";

    URL obj = new URL(url);
    HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
    conn.setReadTimeout(5000);
    conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
    conn.addRequestProperty("User-Agent", "Mozilla");
    conn.addRequestProperty("Referer", "google.com");

    System.out.println("Request URL ... " + url);

    boolean redirect = false;


    int status = conn.getResponseCode();
    if (status == HttpURLConnection.HTTP_MOVED_TEMP
            || status == HttpURLConnection.HTTP_MOVED_PERM
            || status == HttpURLConnection.HTTP_SEE_OTHER) {
        redirect = true;
    }

    System.out.println("Response Code ... " + status);

    if (redirect) {

        // get redirect url from "location" header field
        String newUrl = conn.getHeaderField("Location");

        // get the cookie if needed, for login
        String cookies = conn.getHeaderField("Set-Cookie");

        // open a new connection to the redirect target
        conn = (HttpURLConnection) new URL(newUrl).openConnection();
        if (cookies != null) {
            // only forward a cookie if the server actually set one
            conn.setRequestProperty("Cookie", cookies);
        }
        conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
        conn.addRequestProperty("User-Agent", "Mozilla");
        conn.addRequestProperty("Referer", "google.com");

        System.out.println("Redirect to URL : " + newUrl);

    }

    BufferedReader in = new BufferedReader(
                              new InputStreamReader(conn.getInputStream()));
    String inputLine;
    StringBuffer html = new StringBuffer();

    while ((inputLine = in.readLine()) != null) {
        html.append(inputLine);
    }
    in.close();

    System.out.println("URL Content... \n" + html.toString());
    System.out.println("Done");

    } catch (Exception e) {
        e.printStackTrace();
    }

  }

}
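
The example above handles a single hop and assumes that Location is an absolute URL. A hedged variation that resolves relative Location values against the current URL and stops after a fixed number of hops could look like this. RedirectFollower, resolve, finalUrl, and the limit of 5 are illustrative choices, not part of the original answer.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RedirectFollower {

    static final int MAX_HOPS = 5; // arbitrary safety limit against redirect loops

    // Resolves a possibly relative Location value against the URL that produced it.
    static String resolve(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    // Follows redirects manually and returns the last URL reached.
    static String finalUrl(String startUrl) throws IOException {
        String url = startUrl;
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestProperty("User-Agent", "Mozilla");
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            boolean redirect = status >= 300 && status < 400 && location != null;
            if (!redirect) {
                return url; // no further hop: this is the final URL
            }
            url = resolve(url, location);
        }
        throw new IOException("more than " + MAX_HOPS + " redirects from " + startUrl);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(finalUrl(
                "http://www.onvista.de/aktien/snapshot.html?ID_OSI=36714349"));
    }
}
```

Doing the hops by hand also covers redirects from http to https, which the built-in redirect handling skips.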

