How to convert HTML to plain text in C#?

Question

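I am trying to get plain text from an HTML website, but I am getting the HTML code instead of plain text. For example, given <b>Hello</b><p>It's me</p>, how can I convert it to "Hello It's me"? Any kind of help is much appreciated! Here is my code.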

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace WindowsFormsApplication2
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create("https://www.dailyfx.com/real-time-news");
            myRequest.Method = "GET";
            WebResponse myResponse = myRequest.GetResponse();
            StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
            string result = sr.ReadToEnd();

            // result holds the raw HTML markup of the page, not plain text
            textBox1.Text = result;
            sr.Close();
            myResponse.Close();
        }
    }
}

Answer 1

You can use a regular expression for this:

Regex.Replace(htmltext, "<.*?>", string.Empty);

For example:

string htmltext = "<p>Test1 <b>.NET</b> Test2 Test3 <i>HTML</i> Test4.</p>";

Output: Test1 .NET Test2 Test3 HTML Test4.

This link may help you: http://www.codeproject.com/Tips/136704/Remove-all-the-HTML-tags-and-display-a-plain-text
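The regex approach can be extended a little so that script and style blocks, HTML entities, and leftover whitespace are handled too. The following is only a minimal sketch: the class name `HtmlStripper` and method name `ToPlainText` are hypothetical, and a regular expression cannot cope with every form of real-world HTML.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HtmlStripper
{
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Drop <script> and <style> elements together with their contents.
        string text = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words do not merge.
        text = Regex.Replace(text, @"<[^>]+>", " ");

        // Turn entities such as &amp; back into the characters they stand for.
        text = WebUtility.HtmlDecode(text);

        // Collapse runs of whitespace into single spaces.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

In the question's click handler this could be used as `textBox1.Text = HtmlStripper.ToPlainText(result);`. Tags are replaced with a space instead of an empty string, so `<b>Hello</b><p>It's me</p>` becomes `Hello It's me` and not `HelloIt's me`.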

Answer 2

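Short answer: there is no direct conversion. You are "screen scraping" a website; parse the resulting string to extract what you need (or, better still, check whether the site in question offers an API).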

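Websites are rendered in HTML, not plain text. Although you get the result back as a string, you need to parse it to extract the text you are interested in. The actual extraction depends heavily on what you are trying to accomplish. If the site is well-formed XHTML, you can load it as XML into an XDocument and walk the tree to get the information you need; otherwise, the HTMLAgilityPack suggested in one of the comments may help (it is not as magical as the comment implies; it takes a bit more work than GetString...).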
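For the well-formed XHTML case, walking the tree with XDocument might look like the sketch below. The class name `XhtmlText` and method name `Extract` are hypothetical, and the input is assumed to be valid XML with a single root element; `XDocument.Parse` throws an `XmlException` on anything else, which includes most HTML found in the wild.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class XhtmlText
{
    public static string Extract(string xhtml)
    {
        // Throws XmlException if the markup is not well-formed XML.
        XDocument doc = XDocument.Parse(xhtml);

        // Visit every text node in document order and keep the non-empty ones.
        var pieces = doc.DescendantNodes()
            .OfType<XText>()
            .Select(t => t.Value.Trim())
            .Where(s => s.Length > 0);

        return string.Join(" ", pieces);
    }
}
```

For example, `XhtmlText.Extract("<div><b>Hello</b><p>It's me</p></div>")` yields `Hello It's me`. The snippet from the question has two top-level elements, so it would need a wrapping root element before it could be parsed this way.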

2 solutions

Solution 1
1 2016-10-13 07:56:58

Solution 2
0 2016-10-13 07:35:30