
Getting data from the web using Android?

When using Eclipse for Java, I'm able to access data from websites and fill out online forms using Selenium. All I have to do is call WebDriver driver = new HtmlUnitDriver(); followed by driver.get("http://www.google.com"); and driver.findElement(). To set this up, I go into the Java Build Path, open Libraries, and add the external JAR file selenium-server-standalone-2.39.0.jar.

I'd like to do the same on Android but am having difficulty. I understand there was something called Selenium for Android, but it's no longer supported. Now there's Selendroid. But while the code looks vaguely similar to what I write in Eclipse for Java (i.e., SelendroidCapabilities capa = new SelendroidCapabilities("io.selendroid.testapp:0.12.0");, WebDriver driver = new SelendroidDriver(capa);, WebElement inputField = driver.findElement(By.id("my_text_field"));), I don't think this is actually what I am looking for. I even tried adding selendroid-standalone-0.12.0-with-dependencies.jar to the Android project's libraries, and all I got back was this error in the console:

Dx warning: Ignoring InnerClasses attribute for an anonymous inner class
(org.apache.xalan.lib.sql.SecuritySupport12$8) that doesn't come with an
associated EnclosingMethod attribute. This class was probably produced by a
compiler that did not target the modern .class file format. The recommended
solution is to recompile the class from source, using an up-to-date compiler
and without specifying any "-target" type options. The consequence of ignoring
this warning is that reflective operations on this class will incorrectly
indicate that it is *not* an inner class.

So my question is: where can I learn how to have an Android app go to a web page and retrieve some data, without actually opening the page on screen (this is strictly background work)? Or, what are the steps for getting data from a website on Android using identifiers such as id, name, or XPath?

Use jsoup for this. I think that's what you're looking for.

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.

Download the JAR and include it in your project.

Simple example:

Document doc = Jsoup.connect("http://example.com/").get();
String title = doc.title();

Read the API docs for more info.

Also make sure to put network calls in an AsyncTask rather than on the main UI thread.
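As a rough illustration of keeping the network call off the UI thread, here is a minimal sketch that uses a plain worker thread instead of AsyncTask. The names BackgroundFetch, Callback, and fetchInBackground are hypothetical and not part of jsoup or Android; the Callable stands in for whatever does the real work, for example a call that wraps Jsoup.connect(...).get().title().

```java
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only: runs a fetch on a worker thread.
final class BackgroundFetch {

    /** Hypothetical callback type: receives the fetched text or the failure. */
    interface Callback {
        void onResult(String text);
        void onError(Exception error);
    }

    /**
     * Starts a worker thread, runs the fetch on it, and reports the outcome
     * to the callback FROM THAT WORKER THREAD. On Android, switch back to the
     * UI thread (for example with Activity.runOnUiThread) inside the callback
     * before touching any views.
     */
    static Thread fetchInBackground(final Callable<String> fetch, final Callback callback) {
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    callback.onResult(fetch.call());
                } catch (Exception e) {
                    callback.onError(e);
                }
            }
        }, "fetch-worker");
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real network call; no request is made here.
        Thread worker = fetchInBackground(new Callable<String>() {
            @Override
            public String call() {
                return "page title goes here";
            }
        }, new Callback() {
            @Override
            public void onResult(String text) {
                System.out.println("Fetched: " + text);
            }

            @Override
            public void onError(Exception error) {
                System.out.println("Failed: " + error);
            }
        });
        worker.join();
    }
}
```

The callback style avoids blocking the caller, which matters on Android because waiting on the main thread for a result would freeze the UI just as the network call itself would.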

I eventually found something that is exactly what I wanted: HtmlCleaner. There's a good guide here.

Download the JAR file here and include it in the project's library.

Then use the following code to get your element from the XPath:

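import java.io.IOException;
import java.net.URL;

import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.htmlcleaner.XPatherException;

import android.app.Activity;
import android.os.Bundle;
import android.util.Log;

public class Main extends Activity {

    // HTML page
    static final String PAGE_URL = "https://www.yourpage.com/";
    // XPath query
    static final String XPATH = "//some/path/here";

    @Override
    public void onCreate(Bundle savedInstanceState) {
        // init view layout
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        // Android forbids network access on the main UI thread,
        // so fetch the page on a background thread
        new Thread(new Runnable() {
            @Override
            public void run() {
                // decide output
                String value = getData();
                Log.d("Main", "XPath result: " + value);
            }
        }).start();
    }

    public String getData() {
        String data = "";

        // config cleaner properties
        HtmlCleaner htmlCleaner = new HtmlCleaner();
        CleanerProperties props = htmlCleaner.getProperties();
        props.setAllowHtmlInsideAttributes(false);
        props.setAllowMultiWordAttributes(true);
        props.setRecognizeUnicodeChars(true);
        props.setOmitComments(true);

        try {
            // create URL object and get HTML page root node
            TagNode root = htmlCleaner.clean(new URL(PAGE_URL));

            // query XPath
            Object[] statsNode = root.evaluateXPath(XPATH);
            // process data if found any node
            if (statsNode.length > 0) {
                // I already know there's only one node, so pick index at 0.
                TagNode resultNode = (TagNode) statsNode[0];
                // get text data from HTML node
                data = resultNode.getText().toString();
            }
        } catch (IOException e) {
            // also covers MalformedURLException from new URL(...)
            Log.e("Main", "Could not load " + PAGE_URL, e);
        } catch (XPatherException e) {
            Log.e("Main", "Could not evaluate XPath " + XPATH, e);
        }

        // return value (empty string if nothing was found or an error occurred)
        return data;
    }
}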
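To try out an XPath expression before wiring it into the Activity, something like the following sketch may help. It is not HtmlCleaner: it uses the JDK's own javax.xml.xpath package, which only accepts well-formed markup (real-world HTML often is not well-formed, which is the reason for using a cleaner in the first place). The class name XPathTryout, the method firstText, and the sample markup are all made up for illustration.

```java
import java.io.StringReader;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only: evaluates an XPath query
// against a well-formed markup string using the JDK's XPath support.
final class XPathTryout {

    /**
     * Returns the text of the first node matching the XPath expression,
     * or an empty string when nothing matches.
     * Assumes the markup is well-formed XML/XHTML.
     */
    static String firstText(String wellFormedMarkup, String xpath)
            throws XPathExpressionException {
        XPath evaluator = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(wellFormedMarkup));
        return evaluator.evaluate(xpath, source);
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Made-up sample page; a real page would come from the network.
        String page = "<html><body>"
                + "<div id=\"stats\"><span>42</span></div>"
                + "</body></html>";
        System.out.println(firstText(page, "//div[@id='stats']/span")); // prints "42"
    }
}
```

Once the expression returns what you expect on a saved, tidied copy of the page, the same string can be used as the XPATH constant in the Activity above, keeping in mind that HtmlCleaner supports only a subset of XPath.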
