
Java Servlet as an HTTP Proxy

I have read hundreds of SO posts and studied several of the available Java HTTP proxy sources, but I could not find a solution to my problem.

I wrote a web app that proxies HTTP requests. The web app works, but links and referrers break because the "root" of the proxied page points to the root of my server and not to the path of my proxy servlet.

To make it clearer:

  1. My ProxyServlet receives a request for "http://myserver.com/proxy/ProxyServlet?foo=bar".

  2. The ProxyServlet then fetches the page content from ServerX (e.g. "http://original.com/test.html").

  3. The content of the page is delivered to the browser by simply reading from one stream, writing to the other, and copying the headers.

  4. The browser displays the page. The URL shown in the browser is the original request ("http://myserver.com/proxy/ProxyServlet?foo=bar"), but all relative links now point to "http://myserver.com/XXX.html" instead of "http://myserver.com/proxy/ProxyServlet/XXX.html".
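The resolution behaviour in step 4 can be reproduced with java.net.URI, which approximates how a browser resolves a link against the page URL. This is only a minimal sketch; the class name RelativeLinkDemo and the helper resolve are illustrative and not part of the web app.

```java
import java.net.URI;

class RelativeLinkDemo {
    /** Resolves a link against the URL of the page it appears on. */
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String page = "http://myserver.com/proxy/ProxyServlet?foo=bar";
        // A root-relative link ignores the servlet path entirely.
        System.out.println(resolve(page, "/XXX.html")); // http://myserver.com/XXX.html
        // A document-relative link replaces the last path segment ("ProxyServlet").
        System.out.println(resolve(page, "XXX.html"));  // http://myserver.com/proxy/XXX.html
    }
}
```

Neither form resolves to "http://myserver.com/proxy/ProxyServlet/XXX.html" on its own.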

Is there a response header with which I can change the "path" so that relative links correctly point to my ProxyServlet?

(Rewriting the page content and replacing links would be too difficult, because the page contains relatively addressed elements such as JavaScript code and other active content.)

(Changing the mapping of my servlet to "/*" is also not possible; it must be accessed via this path.)

You are reinventing a "reverse proxy" and are missing the "URL rewriting" feature. Off the top of my search results, here's an open source proxy servlet that does this: http://j2ep.sourceforge.net/docs/rewrite.html

You should also know that there is probably something wrong with the system architecture if you have to do this. Dropping in a standalone proxy such as Apache, nginx, or Varnish should always be an option, as you will HAVE to add one (or more!) once you start scaling.

It sounds like the page you're proxying uses absolute (root-relative) links, e.g. <a href="/XXX.html">, which means "no matter where this link is found, look for it relative to the document root". If you have control of it, the best thing is for the proxy target to be more lenient in its linking and use <a href="XXX.html"> instead. If you can't do that, then you need to rewrite these URLs. Some example code, using JSoup:

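import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Parse the proxied body; the second argument is the base URI used to resolve relative links.
Document doc = Jsoup.parse(rawBody, getDisplayUrl());

// Rewrite the href of stylesheets and anchors to absolute URLs.
for (Element cssALink : doc.select("link[rel=stylesheet],a[href]"))
{
    cssALink.attr("href", cssALink.absUrl("href"));
}
// Rewrite the src of scripts and images to absolute URLs.
for (Element imgJsLink : doc.select("script[src],img[src]"))
{
    imgJsLink.attr("src", imgJsLink.absUrl("src"));
}
return doc.toString();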
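
Note that absUrl resolves against whatever base URI was passed to Jsoup.parse, so if that base is the origin's URL, the rewritten links point straight at the original host rather than back through the proxy. A rough sketch of mapping such links back under the proxy path, using only java.net.URI, might look like this. It assumes a hypothetical path-based scheme in which the servlet is reachable under /proxy/ProxyServlet/ and forwards the remaining path to the origin; the class name ProxyLinkRewriter and its parameters are illustrative, not from the question.

```java
import java.net.URI;

class ProxyLinkRewriter {
    private final URI originBase;     // assumed: the proxied site, e.g. http://original.com/
    private final String proxyPrefix; // assumed: public path of the servlet, no trailing slash

    ProxyLinkRewriter(String originBase, String proxyPrefix) {
        this.originBase = URI.create(originBase);
        this.proxyPrefix = proxyPrefix;
    }

    /**
     * Resolves a link found on originPageUrl. Links to the origin host are mapped
     * under the proxy prefix; all other links are returned as absolute URLs.
     * URI.create throws IllegalArgumentException for malformed links.
     */
    String rewrite(String originPageUrl, String link) {
        URI absolute = URI.create(originPageUrl).resolve(link);
        // getHost() is null for opaque URIs such as mailto:, which then fall through.
        if (!originBase.getHost().equalsIgnoreCase(absolute.getHost())) {
            return absolute.toString();
        }
        String query = absolute.getRawQuery();
        return proxyPrefix + absolute.getRawPath() + (query != null ? "?" + query : "");
    }
}
```

With new ProxyLinkRewriter("http://original.com/", "/proxy/ProxyServlet"), a link "/XXX.html" found on "http://original.com/test.html" becomes "/proxy/ProxyServlet/XXX.html". The sketch drops fragments and does not touch URLs built at runtime by JavaScript.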
