
Getting data from a webpage table

I know how to read content from a web page, but I am stuck on extracting the useful content from it. I want to extract only the tables that contain data, and I have no table ID to select them by. For example, I want to get the tables from the link below, using C#.

http://www.unece.org/cefact/locode/service/location.html

In general, to parse data out of HTML, you should use a purpose-built HTML parser.

Two good options are:

  • HTML Agility Pack - uses XPath / LINQ to query the parsed HTML
  • CsQuery - uses jQuery-like selector syntax to query the parsed HTML

The choice between the two should boil down to which query syntax you (and your team) are more comfortable with.
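As a rough illustration of the HTML Agility Pack route, here is a minimal sketch that selects every table with XPath (no table ID needed) and keeps only those with at least one non-empty cell. It assumes the HtmlAgilityPack NuGet package is installed; the `TableExtractor` class, the `ExtractDataTables` method, and the sample HTML are hypothetical names and data made up for this example, not part of the original question or of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

static class TableExtractor
{
    // Hypothetical helper: returns each table that has at least one
    // non-empty cell, as a list of rows, each row a list of cell texts.
    public static List<List<List<string>>> ExtractDataTables(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<List<List<string>>>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var tables = doc.DocumentNode.SelectNodes("//table");
        if (tables == null) return result;

        foreach (var table in tables)
        {
            // Note: ".//tr" also picks up rows of nested tables, so a layout
            // table that wraps a data table will be reported as well.
            var trs = table.SelectNodes(".//tr");
            if (trs == null) continue;

            var rows = new List<List<string>>();
            foreach (var tr in trs)
            {
                var cells = tr.SelectNodes("./th|./td");
                if (cells == null) continue;

                // Decode entities such as &nbsp; and trim surrounding whitespace.
                rows.Add(cells
                    .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                    .ToList());
            }

            // Keep the table only if some cell actually contains text.
            if (rows.Any(r => r.Any(text => text.Length > 0)))
                result.Add(rows);
        }

        return result;
    }

    static void Main()
    {
        // Made-up sample input: one empty table and one table with data.
        const string html = @"
            <table><tr><td> </td></tr></table>
            <table>
              <tr><th>Code</th><th>Name</th></tr>
              <tr><td>AAA</td><td>First place</td></tr>
            </table>";

        foreach (var table in ExtractDataTables(html))
        {
            foreach (var row in table)
                Console.WriteLine(string.Join(" | ", row));
            Console.WriteLine("----");
        }
    }
}
```

To run this against a live page instead of an inline string, the document can be fetched with `new HtmlWeb().Load(url)`, which returns an `HtmlDocument`; the rest of the logic stays the same. What counts as a table "with data" is a judgment call, so the non-empty-cell test above may need tightening (for example, requiring a minimum number of rows) for a real page.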
