
How to store HTML content in a database

I want to store a year of temperatures from a weather forecast web site like this one in a database, so that I can use them later in an Android application. I tried to use Jsoup, but I only get pieces of the table containing the temperatures. Is there any way to get the content of that HTML table so that I can store it?


It would be much better to use the API provided by wunderground instead of screen-scraping the page with jsoup. The main reasons are that the implementation will be a lot cleaner and that it will be immune to stylistic changes in the wunderground web pages. Here is a guide on how to consume a REST web service with Spring.
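As a rough illustration of the API route, here is a minimal sketch that uses the JDK's own `java.net.http` client (Java 11+) rather than Spring, assuming the collection step runs on a desktop or server JVM. The endpoint `https://api.example.com/history`, the query parameter names, and the class `WeatherApiClient` are all placeholders I made up; the real wunderground URL scheme and authentication have to be taken from its API documentation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.time.LocalDate;

final class WeatherApiClient {
    // Placeholder endpoint (assumption): NOT the real wunderground URL.
    static final String BASE_URL = "https://api.example.com/history";

    private final HttpClient http = HttpClient.newHttpClient();
    private final String apiKey;

    WeatherApiClient(String apiKey) {
        this.apiKey = apiKey;
    }

    // Builds the request URI; the parameter names (city, date, key) are hypothetical.
    static URI historyUri(String apiKey, String city, LocalDate day) {
        String query = "city=" + URLEncoder.encode(city, StandardCharsets.UTF_8)
                + "&date=" + day  // LocalDate prints as ISO-8601, e.g. 2015-01-02
                + "&key=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8);
        return URI.create(BASE_URL + "?" + query);
    }

    // Fetches the raw JSON body for one day; turning it into objects is left to a JSON library.
    String fetchHistoryJson(String city, LocalDate day) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(historyUri(apiKey, city, day))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `fetchHistoryJson` once per day of the year would give you the raw responses. Because the request depends only on a documented URL and not on the page layout, a redesign of the web site should not break it.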

Once you have retrieved the data from the API, you can easily store it in a database using an ORM framework like Hibernate, since you will already have created the objects that hold the retrieved data. You can make your life even easier if you use Spring's Hibernate integration to save the data. Check out this guide.

The guides mentioned above use Spring Boot, which makes it very easy to get started with the Spring framework (gone are the days when it was almost impossible for a novice to start a Spring project alone).

Broadly speaking, the HTML document displayed on the website would have to be parsed programmatically, tokenized, converted to suitable data types, and finally stored in the database. However, you should first check whether the data on the website can be read via a SOAP web service or something similar, as the interface would be cleaner and the approach more robust.
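The parse, tokenize, convert, and store steps could be sketched roughly as follows. This is only an illustration under several assumptions: the table has one row per day with three `<td>` cells (an ISO date, a high, and a low temperature), the class `TemperatureTableImporter` and the table `daily_temperature` with its columns are hypothetical names, and the deliberately naive regular expressions stand in for a real HTML parser such as Jsoup, which you would want for anything but simple, well-formed markup.

```java
import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemperatureTableImporter {

    /** One table row converted to typed values; the column layout is an assumption. */
    static final class DailyTemperature {
        final LocalDate date;
        final double highC;
        final double lowC;

        DailyTemperature(LocalDate date, double highC, double lowC) {
            this.date = date;
            this.highC = highC;
            this.lowC = lowC;
        }
    }

    // Naive patterns: fine for a simple table, not a general HTML parser.
    private static final Pattern ROW =
            Pattern.compile("<tr\\b[^>]*>(.*?)</tr>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    private static final Pattern CELL =
            Pattern.compile("<td\\b[^>]*>(.*?)</td>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Parse the rows, tokenize each into cell texts, and convert the cells to typed values.
    static List<DailyTemperature> parse(String html) {
        List<DailyTemperature> result = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                // Drop nested tags such as <span> and surrounding whitespace.
                cells.add(cell.group(1).replaceAll("<[^>]+>", "").trim());
            }
            if (cells.size() < 3) {
                continue; // header rows built from <th> cells contain no <td> and are skipped
            }
            result.add(new DailyTemperature(
                    LocalDate.parse(cells.get(0)),
                    Double.parseDouble(cells.get(1)),
                    Double.parseDouble(cells.get(2))));
        }
        return result;
    }

    // Store the rows through plain JDBC; table and column names are hypothetical.
    static void store(Connection connection, List<DailyTemperature> rows) throws SQLException {
        String sql = "INSERT INTO daily_temperature (day, high_c, low_c) VALUES (?, ?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (DailyTemperature row : rows) {
                statement.setDate(1, Date.valueOf(row.date));
                statement.setDouble(2, row.highC);
                statement.setDouble(3, row.lowC);
                statement.addBatch();
            }
            statement.executeBatch();
        }
    }
}
```

Keeping `parse` separate from `store` means the conversion can be checked against a saved copy of the page without a database, and the storage step can later be swapped for an ORM without touching the parsing code.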
