VB.NET function to return a percentage of an HTML string

Question

I have some articles saved in a database. On certain pages I want to show a certain percentage of each article based on some settings, e.g. 80% of the article.

The problem is that HTML is not plain text: if I take a certain percentage of the string length, the formatting gets broken. Can anyone help me with a function that takes a string and a new length (which will be less than the old string length) and returns truncated HTML without disturbing the formatting? I have tried this:

Private Function HtmlSubstring(html As String, maxlength As Integer) As String
        'initialize regular expressions
        Dim htmltag As String = "</?\w+((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?>"
        Dim emptytags As String = "<(\w+)((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?></\1>"

        'match all html start and end tags, otherwise get each character one by one..
        Dim expression As Regex = New Regex(String.Format("({0})|(.?)", htmltag))
        Dim matches As MatchCollection = expression.Matches(html)

        Dim i As Integer = 0
        Dim content As New StringBuilder()
        For Each match As Match In matches
            If match.Value.Length = 1 AndAlso i < maxlength Then
                content.Append(match.Value)
                i += 1
            ElseIf match.Value.Length > 1 Then
                'the match contains a tag
                content.Append(match.Value)
            End If
        Next

        Return Regex.Replace(content.ToString(), emptytags, String.Empty)
    End Function

But it did not always work.

Answer 1

I'm pretty sure that there is no built-in .NET method to do what you ask. However, consider the following method:

Your HTML page is probably structured, i.e., it has paragraphs, headings, etc.:

<h1>...</h1>
<p>...</p>
<h2>...</h2>
<p>...<more tags>...</more tags></p>
<h2>...</h2>
<p>...</p>
...

What you could do is:

1. Use an HTML parser (the HTML Agility Pack is often mentioned in this context) and parse your HTML into a data structure.
2. Take the first 80% of the top-level tags. For example, if the root node of your HTML content has ten children, take the first eight:
```
<h1>...</h1>
<p>...</p>
<p>...</p>
<h2>...</h2>
<p> ... <more tags> ... </more tags> ... </p>
<p>...</p>
<p>...<more tags>...</more tags>...</p>
<p>...</p>
---------------
<h2>...</h2>
<p>...</p>
```

If your article is approximately evenly distributed (i.e., long and short paragraphs average out over the course of the article), this will give you approximately 80% of the text without breaking any HTML formatting. As an additional benefit, you won't be splitting the text mid-line or mid-paragraph.
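
The two steps above could be sketched roughly as follows. This is only an illustration, not code from the answer: it assumes the third-party HtmlAgilityPack NuGet package is referenced, and the names `TopLevelTruncation` and `TakeTopLevelPercentage` are made up for the example. It also assumes that only element nodes should count as "top-level tags", so whitespace text between them is ignored.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports HtmlAgilityPack ' third-party NuGet package, not part of .NET itself

Module TopLevelTruncation
    ' Hypothetical helper: keeps the first `percent` (0-100) of the top-level elements.
    Function TakeTopLevelPercentage(html As String, percent As Integer) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Assumption: only element nodes count; text and comment nodes between them are skipped.
        Dim elements As List(Of HtmlNode) = doc.DocumentNode.ChildNodes.Where(Function(n) n.NodeType = HtmlNodeType.Element).ToList()

        ' Round down so that at most the requested share is kept.
        Dim keep As Integer = CInt(Math.Floor(elements.Count * percent / 100.0))

        ' Each kept node is written out whole, so no element is left unclosed.
        Dim result As New StringBuilder()
        For Each node As HtmlNode In elements.Take(keep)
            result.Append(node.OuterHtml)
        Next
        Return result.ToString()
    End Function

    Sub Main()
        ' Five top-level elements; 80% keeps the first four.
        Dim article As String = "<h1>Title</h1><p>One</p><p>Two</p><p>Three</p><p>Four</p>"
        Console.WriteLine(TakeTopLevelPercentage(article, 80))
    End Sub
End Module
```

Because whole nodes are kept or dropped, the result is only approximately 80% of the text, as noted above; a very long final paragraph can shift the real share noticeably.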

Answer 2

In the end, the following worked quite well for me:

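'requires Imports System.Text and Imports System.Text.RegularExpressions at the top of the file
Private Function HtmlSubstring(html As String, maxlength As Integer) As String
    'matches a single opening, closing or self-closing html tag
    Const htmltag As String = "</?\w+((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?>"
    'match all html start and end tags, otherwise get each character one by one;
    'Singleline lets "." match line breaks too, so they are copied instead of silently dropped
    Dim expression As Regex = New Regex(String.Format("({0})|(.?)", htmltag), RegexOptions.Singleline)
    Dim matches As MatchCollection = expression.Matches(html)
    Dim i As Integer = 0               'text characters copied so far
    Dim isEndingSet As Boolean = False 'True once the trailing "...." has been appended
    Dim content As StringBuilder = New StringBuilder()
    For Each match As Match In matches
        If match.Value.Length = 1 AndAlso i < maxlength Then
            'a single text character that still fits within the limit
            content.Append(match.Value)
            i += 1
        ElseIf match.Value.Length > 1 Then
            'the match contains a tag: drop line breaks after the cut,
            'but keep every other tag so that opened elements are still closed
            If (isEndingSet AndAlso (match.Value.ToLower() = "<br />" OrElse match.Value.ToLower() = "<br>")) Then
                Continue For
            End If
            content.Append(match.Value)
        End If
        'mark the cut exactly once, as soon as the limit is reached
        If (i = maxlength AndAlso Not isEndingSet) Then
            content.Append("....")
            isEndingSet = True
        End If
    Next

    Return content.ToString()
End Function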
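
The function above takes a character budget (`maxlength`), while the question asks for a percentage. One possible way to bridge the two is sketched below. It is not part of the answer: the helper names `VisibleLength` and `LengthForPercentage` are hypothetical, and the simple pattern `<[^>]+>` used to strip tags is an assumption that is looser than the answer's own tag regex.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PercentageDemo
    ' Hypothetical helper: number of characters left once tags are removed.
    ' Assumption: "<[^>]+>" is a good enough approximation of a tag for counting purposes.
    Function VisibleLength(html As String) As Integer
        Return Regex.Replace(html, "<[^>]+>", String.Empty).Length
    End Function

    ' Hypothetical helper: converts a percentage (0-100) into a character budget.
    Function LengthForPercentage(html As String, percent As Integer) As Integer
        ' Clamp so that out-of-range settings cannot produce a negative or oversized budget.
        Dim clamped As Integer = Math.Max(0, Math.Min(100, percent))
        Return CInt(Math.Floor(VisibleLength(html) * clamped / 100.0))
    End Function

    Sub Main()
        Dim article As String = "<p>Hello <b>world</b></p>"
        ' "Hello world" has 11 visible characters; 80% of 11 is 8.8, rounded down to 8.
        Dim maxLength As Integer = LengthForPercentage(article, 80)
        Console.WriteLine(maxLength) ' prints 8
    End Sub
End Module
```

The resulting value can then be passed as `maxlength` to `HtmlSubstring`. Note that character entities such as `&amp;` are counted as several characters here, just as they are in the function above.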


Answer 1
1 2013-03-01 07:02:24

Answer 2
0 Accepted 2013-03-08 07:22:53
