Java applet - delete/ignore all cookies (JSoup)

I've written a Java applet that fetches HTML content from multiple pages on a single host and extracts data from it. I use Jsoup and it works perfectly, but it automatically sends the cookies the browser already holds for that host, and it sends newly set cookies on subsequent requests. (I believe Java does this natively.)

I want it to ignore all cookies set by the server while the applet runs, and to ignore any cookies the browser may already have.

My code is very simple.

    String url = "http://example.com/my/web-page.html";
    Document document = Jsoup.connect(url).userAgent("<hard-coded static value>").get();
    // Extract data from document with org.jsoup.nodes.Document.select(), etc.

This repeats with multiple URLs, all on the same host (example.com).

In summary, I basically want it to:

  1. Ignore any cookies for example.com that might be set in the browser.
  2. If the server sets any new cookies when the applet makes a request, ignore them on subsequent requests. If possible, also block those cookies from being stored in the browser.

I've searched a lot and haven't been able to find a solution, so I'd really appreciate any help. I don't mind using Apache HttpClient or another third-party library, but I'd prefer not to, so that I can keep the applet's file size small.

Thanks a ton in advance :)

You should manipulate org.jsoup.Connection.Request for this:

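    // Imports assumed: org.jsoup.Jsoup, org.jsoup.Connection, java.util.ArrayList, java.util.Map
    String url = "http://example.com/my/web-page.html";
    Connection con = Jsoup.connect(url).userAgent("<hard-coded static value>");
    ...
    con.get();
    ...
    Connection.Request request = con.request();
    Map<String, String> cookies = request.cookies();
    // Iterate over a copy of the names: removing entries while iterating the
    // live key set could throw ConcurrentModificationException.
    for (String cookieName : new ArrayList<String>(cookies.keySet())) {
        // filter here: skip any cookies you want to keep in the map
        request.removeCookie(cookieName);
    }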

You should also disable followRedirects and handle redirects manually, removing cookies at each step. You will have to implement your own "cookie/domain remover".
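As a rough starting point for such a "cookie remover", the sketch below filters a cookie map by name using only the standard library. The class name `CookieFilter` and the method `retainOnly` are hypothetical and not part of Jsoup; the assumption is that the cookies are available as a `Map<String, String>`, as in the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of Jsoup): filters a cookie map by cookie name. */
final class CookieFilter {
    private CookieFilter() {}

    /**
     * Returns a copy of the given cookies that contains only the names in keep.
     * The input map is left untouched, so nothing is modified while iterating.
     */
    static Map<String, String> retainOnly(Map<String, String> cookies, Set<String> keep) {
        Map<String, String> filtered = new LinkedHashMap<String, String>(cookies);
        // Drop every entry whose name is not explicitly allowed.
        filtered.keySet().retainAll(keep);
        return filtered;
    }
}
```

Passing an empty set for `keep` yields an empty map, which corresponds to dropping every cookie before the next request or redirect hop.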

Jsoup uses java.net.HttpURLConnection internally, and you can't intercept the core step where the request is actually executed, org.jsoup.helper.HttpConnection.Response.execute(...), because it is static and has package-protected access. You also can't set req (the private request object) or res (the private response object) in HttpConnection. Moreover, you can't supply your own org.jsoup.Connection implementation (or extend the existing one, HttpConnection, because its constructor is private) and force Jsoup to use it.

Considering all of the above, I advise using HttpClient or HtmlUnit, because otherwise you'll end up reinventing the wheel in a restricted environment.
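Because Jsoup sits on top of java.net.HttpURLConnection, which consults the JVM-wide default java.net.CookieHandler, one library-free option that may be worth trying is to install a handler that never returns or stores cookies. This is only a sketch under assumptions: the class name `NoCookieHandler` is hypothetical, it assumes the browser cookies reach the applet through the default CookieHandler, and CookieHandler.setDefault needs a permission that an unsigned applet is unlikely to have.

```java
import java.net.CookieHandler;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Jsoup): a cookie handler that ignores all cookies. */
final class NoCookieHandler extends CookieHandler {

    @Override
    public Map<String, List<String>> get(URI uri, Map<String, List<String>> requestHeaders) {
        // Contribute no Cookie headers to outgoing requests.
        return Collections.emptyMap();
    }

    @Override
    public void put(URI uri, Map<String, List<String>> responseHeaders) {
        // Silently discard any Set-Cookie headers from the server.
    }

    /**
     * Installs this handler for the whole JVM. Assumption: the caller has
     * NetPermission("setCookieHandler"); otherwise a SecurityException is thrown.
     */
    static void install() {
        CookieHandler.setDefault(new NoCookieHandler());
    }
}
```

Calling `NoCookieHandler.install()` once, before the first `Jsoup.connect(...)`, would be the intended usage. Cookies passed explicitly to Jsoup are not affected by this handler.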

Instead of using Connection (the value returned by Jsoup.connect("url")), use Response:

    // Imports assumed: org.jsoup.Jsoup, org.jsoup.Connection.Method,
    // org.jsoup.Connection.Response, java.util.HashMap, java.util.Map
    // Empty map: Jsoup itself adds no cookies to the request.
    Map<String, String> cookies = new HashMap<String, String>();

    Response res = Jsoup
        .connect("url")
        .cookies(cookies)
        .userAgent("userAgent")
        .method(Method.GET) // or whatever method is needed
        .execute();

I know it's a long statement, but it works fine.
