HTML to RTF string conversion in VB.NET

I have a jQuery rich text editor (jQuery TE, http://jqueryte.com/) that lets the end user enter content, which is then used to generate a Word document report.

The process is as follows:

The user fills in the content --> the HTML of the rich text box is saved to the database --> the stored HTML is pulled from the database and converted into an RTF string so that it can be opened in Microsoft Word.

I tried writing the conversion myself (a function that takes my HTML string and returns the equivalent RTF string), but handling all the HTML tags became too complicated. I searched a lot and could not find a solution for this, or at least not a free one.

Any help would be greatly appreciated.

Thanks in advance.

One option is to use the Microsoft Office Word Interop (although some people frown upon it). The code is C#, but the same calls are available from VB.NET:

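using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Windows.Forms;
using Word = Microsoft.Office.Interop.Word; // alias avoids a clash with System.Windows.Forms.Application

public static class HtmlRtfConverter
{
    // value is the HTML fragment stored in the database.
    public static string HtmlToRtf(string value)
    {
        string html = "<html><head><style>p{margin:0}</style></head><body style=\"font-family:Arial;\">" + value.Replace("<p>&nbsp;</p>", "<p><br></p>") + "</body></html>";

        // Unique temp paths; Path.GetTempFileName() would leave an extra empty .tmp file behind.
        string htmlPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");
        string rtfPath = Path.ChangeExtension(htmlPath, ".rtf");
        string rtf;

        Word.Application word = null;
        try
        {
            // Encoding.UTF8 writes a byte-order mark, which helps Word detect the encoding.
            File.WriteAllText(htmlPath, html, Encoding.UTF8);

            // Let Word open the HTML file and save it again as RTF.
            word = new Word.Application();
            Word.Document doc = word.Documents.Open(htmlPath);
            doc.SaveAs(rtfPath, Word.WdSaveFormat.wdFormatRTF);
            doc.Close(false);

            // RTF files are normally 7-bit ASCII; other characters are stored as escape sequences.
            rtf = File.ReadAllText(rtfPath, Encoding.ASCII);
        }
        finally
        {
            // Quit Word and remove the temp files even if the conversion fails.
            if (word != null) word.Quit(false);
            File.Delete(htmlPath);
            File.Delete(rtfPath);
        }

        // Round-trip through a RichTextBox on an STA thread to strip Word's extra markup.
        Thread thread = new Thread(() =>
        {
            using (RichTextBox rtb = new RichTextBox())
            {
                rtb.Rtf = rtf;
                rtf = rtb.Rtf;
            }
        });
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join();
        return rtf;
    }
}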

The reason I use a RichTextBox in addition to Word is that the RTF files Word produces are extremely bulky. When Word's RTF is loaded into a RichTextBox and read back out, the control drops all the markup it does not need, leaving a much smaller RTF string.

