How to retrieve bold, italic and underlined words from plain text and surround them with HTML tags
What I want to achieve:
Input: (the input text comes from an Excel cell)
This is a string includes bold, italic and underlined words.
(In the cell, "string" and "bold" are bold, "italic" is italic and "underlined" is underlined.)
Expected output:
This is a <b>string</b> includes <b>bold</b>, <i>italic</i> and <u>underlined</u> words.
What I tried: (This method iterates over the plain text character by character, not word by word.)
StringBuilder html = new StringBuilder();
StringBuilder fontText = new StringBuilder();
string path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Test.xls");
Application excel = new Application();
Workbook wb = excel.Workbooks.Open(path);
Worksheet excelSheet = wb.ActiveSheet;
// Read the first cell
Range cell = excelSheet.Cells[1, 1];
for (int index = 1; index <= cell.Text.ToString().Length; index++)
{
    // cell here is a Range object
    Characters ch = cell.get_Characters(index, 1);
    bool bold = (bool) ch.Font.Bold;
    if (bold)
    {
        // The opening tag is only written once, before the first bold character
        if (html.Length == 0)
            html.Append("<b>");
        html.Append(ch.Text);
    }
}
if (html.Length != 0) html.Append("</b>");
But this method returns all of the bold text inside a single pair of HTML tags, like <b>stringbold</b>.
The expected result is <b>string</b> and <b>bold</b>.
Any thoughts on this?
Thanks in advance.
Here's what I would do:
Create a helper class that knows about font styles and their opening and closing tags, and which can keep track of the "current" font style.
Start the class out with the Regular style. Then, in your loop, before writing the current character, ask the helper class to insert closing and opening tags if the font style has changed.
At the end of the loop, ask the helper to insert the proper closing tag.
I don't have an Excel interop project to play with, so here's a sample, which you may have to adapt to the specific Excel font types.
First, the helper class:
static class TextHelper
{
    // You may have to use a different type than `FontStyle`
    // Hopefully ch.Font has some type of `Style` property you can use
    public static FontStyle CurrentStyle { get; set; }
    public static string OpenTag { get { return GetOpenTag(); } }
    public static string CloseTag { get { return GetCloseTag(); } }

    // This will return the closing tag for the current font style,
    // followed by the opening tag for the new font style
    public static string ChangeStyleIfNeeded(FontStyle newStyle)
    {
        if (newStyle == CurrentStyle) return string.Empty;
        var transitionStyleTags = GetCloseTag();
        CurrentStyle = newStyle;
        transitionStyleTags += GetOpenTag();
        return transitionStyleTags;
    }

    private static string GetOpenTag()
    {
        switch (CurrentStyle)
        {
            case FontStyle.Bold:
                return "<b>";
            case FontStyle.Italic:
                return "<i>";
            case FontStyle.Underline:
                return "<u>";
            default:
                return "";
        }
    }

    private static string GetCloseTag()
    {
        switch (CurrentStyle)
        {
            case FontStyle.Bold:
                return "</b>";
            case FontStyle.Italic:
                return "</i>";
            case FontStyle.Underline:
                return "</u>";
            default:
                return "";
        }
    }
}
Next, the implementation would look something like this:
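// Start our helper class with 'Regular' font
TextHelper.CurrentStyle = FontStyle.Regular;
var html = new StringBuilder();
for (int index = 1; index <= cell.Text.ToString().Length; index++)
{
    // get_Characters returns a Characters object, not a char
    Characters ch = cell.get_Characters(index, 1);
    // Map the Excel font flags to a FontStyle.
    // Assumption: each character carries at most one of the three styles.
    FontStyle style = FontStyle.Regular;
    if ((bool)ch.Font.Bold) style = FontStyle.Bold;
    else if ((bool)ch.Font.Italic) style = FontStyle.Italic;
    else if ((XlUnderlineStyle)ch.Font.Underline == XlUnderlineStyle.xlUnderlineStyleSingle)
        style = FontStyle.Underline;
    // If the style of this character is different than the current style,
    // this will close the old style and open our new style.
    html.Append(TextHelper.ChangeStyleIfNeeded(style));
    // Append this character
    html.Append(ch.Text);
}
// Close the style at the very end
html.Append(TextHelper.CloseTag);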
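To try the close-then-open idea without an Excel installation, the same loop can be run over plain data. The following is only a minimal sketch: the `Style` enum, the `StyleTagger` class and its `ToHtml` method are hypothetical names introduced for illustration, and the per-character style list stands in for what `cell.get_Characters(index, 1).Font` would report. It assumes each character has exactly one style.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the style of one Excel character.
public enum Style { Regular, Bold, Italic, Underline }

public static class StyleTagger
{
    // Returns the opening or closing tag for a style ("" for Regular).
    private static string Tag(Style style, bool close)
    {
        string name;
        switch (style)
        {
            case Style.Bold: name = "b"; break;
            case Style.Italic: name = "i"; break;
            case Style.Underline: name = "u"; break;
            default: return "";
        }
        return (close ? "</" : "<") + name + ">";
    }

    // styles[i] is the style of text[i]; both must have the same length.
    public static string ToHtml(string text, IReadOnlyList<Style> styles)
    {
        if (text.Length != styles.Count)
            throw new ArgumentException("text and styles must have the same length");

        var html = new StringBuilder();
        var current = Style.Regular;
        for (int i = 0; i < text.Length; i++)
        {
            if (styles[i] != current)
            {
                // Close the old style, then open the new one.
                html.Append(Tag(current, true));
                current = styles[i];
                html.Append(Tag(current, false));
            }
            html.Append(text[i]);
        }
        // Close whatever style is still open at the end.
        html.Append(Tag(current, true));
        return html.ToString();
    }
}
```

For example, calling `StyleTagger.ToHtml("ab cd", new[] { Style.Bold, Style.Bold, Style.Regular, Style.Italic, Style.Italic })` yields `<b>ab</b> <i>cd</i>`: the regular space between the two words forces the bold tag to close before the italic tag opens, which is the behavior the question asks for.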
It took half of my day to figure out this solution.
1. The code works with bold, italic and underlined characters.
2. The algorithm is a little bit complicated. If any optimization is available, or anyone comes up with a better solution, please post a new answer.
ExcelReader method:
public string ExcelReader(string excelFilePath)
{
    StringBuilder resultText = new StringBuilder();
    //string excelFilePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Test.xls");
    Application excel = new Application();
    Workbook wb = excel.Workbooks.Open(excelFilePath);
    Worksheet excelSheet = wb.ActiveSheet;
    // Read the first cell
    Range cell = excelSheet.Cells[1, 1];
    // True once the closing tag of the current bold/italic/underlined word has been written
    bool IfStop = false;
    // True if the next styled character is the first one of a styled word
    bool ifFirstSpecialCharacter = true;
    // Start with an empty tag
    string tag = "";
    // True when the current character is the last one in the cell
    bool isLastIndex = false;
    for (int index = 1; index <= cell.Text.ToString().Length; index++)
    {
        // True if the current character is bold, italic or underlined
        bool IfSpecialType = false;
        // cell here is a Range object
        Characters ch = cell.get_Characters(index, 1);
        XlUnderlineStyle temp = (XlUnderlineStyle)ch.Font.Underline;
        bool underline = false;
        if (temp == XlUnderlineStyle.xlUnderlineStyleSingle)
            underline = true;
        bool bold = (bool)ch.Font.Bold;
        bool italic = (bool)ch.Font.Italic;
        if (underline)
        {
            // The style changed without a plain character in between: close the old tag
            if (tag != "" && tag != "<u>")
            {
                resultText.Append(tag.Insert(1, "/"));
                ifFirstSpecialCharacter = true;
                IfStop = true;
            }
            tag = "<u>";
            IfSpecialType = true;
        }
        if (bold)
        {
            if (tag != "" && tag != "<b>")
            {
                resultText.Append(tag.Insert(1, "/"));
                ifFirstSpecialCharacter = true;
                IfStop = true;
            }
            tag = "<b>";
            IfSpecialType = true;
        }
        if (italic)
        {
            if (tag != "" && tag != "<i>")
            {
                resultText.Append(tag.Insert(1, "/"));
                ifFirstSpecialCharacter = true;
                IfStop = true;
            }
            tag = "<i>";
            IfSpecialType = true;
        }
        if (index == cell.Text.ToString().Length)
            isLastIndex = true;
        DetectSpecialCharacterByType(isLastIndex, resultText, ref tag, IfSpecialType, ref IfStop, ref ifFirstSpecialCharacter, ch);
    }
    wb.Close();
    // Shut down the Excel instance started above so it does not linger in the background
    excel.Quit();
    return resultText.ToString();
}
DetectSpecialCharacterByType method:
private static void DetectSpecialCharacterByType(bool isLastIndex, StringBuilder fontText, ref string tag, bool ifSpecialType, ref bool IfStop, ref bool ifFirstSpecialCharacter, Characters ch)
{
    if (ifSpecialType)
    {
        // If it is the first character of the word, put the opening tag (<b>, <i> or <u>) before it.
        if (ifFirstSpecialCharacter)
        {
            fontText.Append(tag);
            ifFirstSpecialCharacter = false;
            IfStop = false;
        }
        // Edge case: if the last word of the text is styled, append the closing tag after it.
        if (isLastIndex)
        {
            fontText.Append(ch.Text);
            fontText.Append(tag.Insert(1, "/"));
        }
        else
            fontText.Append(ch.Text);
    }
    else
    {
        // The previous character was the last one of a styled word: append the closing tag.
        if (!IfStop && tag != "")
        {
            fontText.Append(tag.Insert(1, "/"));
            IfStop = true;
            ifFirstSpecialCharacter = true;
            tag = "";
        }
        fontText.Append(ch.Text);
    }
}
The code works as-is when copied and pasted, once you add a reference to Microsoft.Office.Interop.Excel.
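Both approaches above assume a character has only one style at a time. One possible way to shorten the flag bookkeeping, and to cope with characters that are, say, both bold and italic, is to describe each character with a flags value and re-emit tags whenever that value changes. This is only a sketch under stated assumptions: `Marks`, `RunTagger` and `ToHtml` are hypothetical names, and the per-character flags would have to be built from `ch.Font.Bold`, `ch.Font.Italic` and `ch.Font.Underline` in a real interop project.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical per-character style flags; values can be combined.
[Flags]
public enum Marks { None = 0, Bold = 1, Italic = 2, Underline = 4 }

public static class RunTagger
{
    // Fixed nesting order, so tags always close in reverse order of opening.
    private static readonly (Marks Mark, string Name)[] Order =
    {
        (Marks.Bold, "b"), (Marks.Italic, "i"), (Marks.Underline, "u")
    };

    private static void Open(StringBuilder sb, Marks marks)
    {
        foreach (var (mark, name) in Order)
            if ((marks & mark) != 0) sb.Append('<').Append(name).Append('>');
    }

    private static void Close(StringBuilder sb, Marks marks)
    {
        for (int k = Order.Length - 1; k >= 0; k--)
            if ((marks & Order[k].Mark) != 0)
                sb.Append("</").Append(Order[k].Name).Append('>');
    }

    // marks[i] describes text[i]; both must have the same length.
    public static string ToHtml(string text, IReadOnlyList<Marks> marks)
    {
        if (text.Length != marks.Count)
            throw new ArgumentException("text and marks must have the same length");

        var sb = new StringBuilder();
        var current = Marks.None;
        for (int i = 0; i < text.Length; i++)
        {
            if (marks[i] != current)
            {
                // Close every open tag, then open the tags for the new combination.
                Close(sb, current);
                current = marks[i];
                Open(sb, current);
            }
            sb.Append(text[i]);
        }
        Close(sb, current);
        return sb.ToString();
    }
}
```

With this sketch, `RunTagger.ToHtml("abc", new[] { Marks.Bold | Marks.Italic, Marks.Bold | Marks.Italic, Marks.None })` produces `<b><i>ab</i></b>c`. Closing and reopening all tags on every change keeps the output well nested at the cost of a few redundant tags when only one of several styles changes.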