Extracting a string from an HTML page using C#

Question

I have a source HTML page and I want to do the following:

Extract a specific string from the whole HTML page and save the selected string in a new HTML page.
Create a database on MySQL with 4 columns.
Import the data from the HTML page into the table on MySQL.

I would be very grateful if someone could help me with this, because I don't have much experience with C#.

Answer 1

You could use this code:

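// Requires: using System.Net.Http; using System.Text; using HtmlAgilityPack;
HttpClient http = new HttpClient();

// eBay is only an example; any absolute URL (including the scheme) works.
var response = await http.GetByteArrayAsync("https://www.ebay.com");
// Decode the full response as UTF-8.
String source = Encoding.UTF8.GetString(response);
// Parse the raw markup. Decode entities on extracted text afterwards
// (e.g. WebUtility.HtmlDecode(node.InnerText)), not on the whole page.
HtmlDocument Nodes = new HtmlDocument();
Nodes.LoadHtml(source);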

The Nodes object then holds all the DOM elements of the HTML page.

You can use LINQ to filter out whatever you need.

Example:

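// Requires: using System.Collections.Generic; using System.Linq;
// GetAttributeValue returns "" for nodes that have no class attribute, so the filter never hits null.
List<HtmlNode> RequiredNodes = Nodes.DocumentNode.Descendants()
                                    .Where(x => x.GetAttributeValue("class", "").Contains("List-Item")).ToList();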
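
For the first step in the question (saving the selected content in a new HTML page), something along these lines might work. It is only a sketch: the sample markup, the "List-Item" class text, the ExtractByClass helper and the extracted.html file name are all assumptions made for illustration, and in practice the source string would come from the download code above.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using HtmlAgilityPack;

class ExtractToNewPage
{
    // Hypothetical helper: returns the outer HTML of every element
    // whose class attribute contains the given text.
    public static string[] ExtractByClass(string html, string classText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.Descendants()
                  .Where(n => n.GetAttributeValue("class", "").Contains(classText))
                  .Select(n => n.OuterHtml)
                  .ToArray();
    }

    static void Main()
    {
        // Sample markup standing in for the downloaded page (assumption).
        string source = "<html><body>" +
                        "<div class=\"List-Item\">First</div>" +
                        "<div class=\"other\">Skip me</div>" +
                        "<div class=\"List-Item\">Second</div>" +
                        "</body></html>";

        string[] fragments = ExtractByClass(source, "List-Item");

        // Wrap the selected fragments in a minimal new page.
        var page = new StringBuilder();
        page.AppendLine("<!DOCTYPE html>");
        page.AppendLine("<html><body>");
        foreach (string fragment in fragments)
            page.AppendLine(fragment);
        page.AppendLine("</body></html>");

        // Output file name is an assumption; written to the working directory.
        File.WriteAllText("extracted.html", page.ToString(), Encoding.UTF8);
        Console.WriteLine("Saved " + fragments.Length + " fragment(s) to extracted.html");
    }
}
```

If only the text is needed rather than the markup, selecting n.InnerText instead of n.OuterHtml should give the plain strings.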

You will probably need to install the Html Agility Pack NuGet package or download it from the link.

Hope this helps.
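
The code above does not cover the MySQL steps from the question. A rough sketch of creating a 4-column table and inserting extracted values is shown here, assuming the MySql.Data NuGet package (namespace MySql.Data.MySqlClient) is installed and a database already exists on the server. The connection string, the scrape_demo database, the items table, its column names and the sample rows are all placeholders, since the question does not say what the four columns should hold.

```csharp
using System;
using System.Collections.Generic;
using MySql.Data.MySqlClient;

class ImportToMySql
{
    // Hypothetical 4-column table: id, title, price, url.
    public const string CreateSql =
        "CREATE TABLE IF NOT EXISTS items (" +
        "id INT AUTO_INCREMENT PRIMARY KEY, " +
        "title VARCHAR(255), " +
        "price VARCHAR(64), " +
        "url VARCHAR(512))";

    public const string InsertSql =
        "INSERT INTO items (title, price, url) VALUES (@title, @price, @url)";

    static void Main()
    {
        // Placeholder credentials; replace with your own server settings.
        const string connectionString =
            "Server=localhost;Database=scrape_demo;Uid=demo_user;Pwd=demo_password;";

        // Sample rows standing in for values extracted from the HTML page.
        var rows = new List<string[]>
        {
            new[] { "First item", "10.00", "https://example.com/1" },
            new[] { "Second item", "12.50", "https://example.com/2" }
        };

        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();

            // Create the table if it is not there yet.
            using (var create = new MySqlCommand(CreateSql, connection))
                create.ExecuteNonQuery();

            // Insert each row with parameters, so page content is never
            // concatenated into the SQL text.
            foreach (string[] row in rows)
            {
                using (var insert = new MySqlCommand(InsertSql, connection))
                {
                    insert.Parameters.AddWithValue("@title", row[0]);
                    insert.Parameters.AddWithValue("@price", row[1]);
                    insert.Parameters.AddWithValue("@url", row[2]);
                    insert.ExecuteNonQuery();
                }
            }
        }
    }
}
```

Parameterized commands are used because scraped text can contain quotes or other characters that would break, or be abused in, a hand-built SQL string.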

Answer 1 (accepted), 2017-03-19 11:53:46
