Extract plain text excerpt from HTML

Question

I have @Html.Raw(Model.Content) in a Razor page. I want to extract a plain-text excerpt, truncated at a word boundary near 232-240 characters and followed by "...". Is there a helper for this?

Something equivalent to the truncate_html gem for Ruby on Rails. Usage:

strip_tags(truncate_html(my_model.content, :length => 240, :omission => '...'))

Answer 1

Solved it with these extension methods:

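public static string TruncateHtml(this string input, int length = 300,
                                   string omission = "...")
{
    // Nothing to truncate: return the input unchanged.
    if (input == null || input.Length <= length)
        return input;
    // Search backwards from the cut-off for the last space, so no word is split.
    int lastSpace = input.LastIndexOf(' ', length);
    // Fall back to a hard cut when there is no space before the cut-off.
    // Plain concatenation avoids string.Format choking on braces in omission.
    return input.Substring(0, (lastSpace > 0) ? lastSpace : length).Trim() + omission;
}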

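// Requires: using System.IO; using System.Xml; using System.Xml.XPath;
public static string StripTags(this string markup)
{
    try
    {
        XPathDocument doc;
        using (StringReader sr = new StringReader(markup))
        using (XmlReader xr = XmlReader.Create(sr,
                           new XmlReaderSettings()
                           {
                               // Fragment allows multiple root elements
                               ConformanceLevel = ConformanceLevel.Fragment
                           }))
        {
            doc = new XPathDocument(xr);
        }

        // .Value concatenates all text nodes, similar to .InnerText of
        // XmlDocument or JavaScript's innerText
        return doc.CreateNavigator().Value;
    }
    catch
    {
        // Null input, or markup that is not well-formed XML
        // (e.g. an unclosed <br> or an undeclared entity such as &nbsp;)
        return string.Empty;
    }
}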

Usage:

@Html.Raw(Model.Content.StripTags().TruncateHtml(240, "..."))
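StripTags parses the markup as XML, so its catch block returns an empty string for HTML that is not well-formed XML, such as content with an unclosed <br> or an &nbsp; entity. If that matters for your content, a more tolerant fallback might look like the sketch below. The name StripTagsLenient and the class HtmlTextSketch are made up for illustration and are not part of the answer above. This is a naive scanner, not an HTML parser: it does not handle comments, script or style bodies, or a '>' inside an attribute value.

```csharp
using System;
using System.Net;
using System.Text;

public static class HtmlTextSketch
{
    // Hypothetical helper: drops everything between '<' and '>' and decodes
    // HTML entities. Every tag becomes a space, so inline tags in the middle
    // of a word will split that word.
    public static string StripTagsLenient(this string markup)
    {
        if (string.IsNullOrEmpty(markup))
            return string.Empty;

        var text = new StringBuilder(markup.Length);
        bool insideTag = false;
        foreach (char c in markup)
        {
            if (c == '<')
            {
                insideTag = true;
                text.Append(' ');          // a tag separates words
            }
            else if (c == '>' && insideTag)
            {
                insideTag = false;
            }
            else if (!insideTag)
            {
                text.Append(c);
            }
        }

        // Turn entities such as &amp; and &nbsp; into characters,
        // then collapse every run of whitespace into a single space.
        string decoded = WebUtility.HtmlDecode(text.ToString());
        return string.Join(" ",
            decoded.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Because the result is plain text with entities already decoded, it can be chained with TruncateHtml in the same way as StripTags.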

Answer 2

You can write your own extension method, as in the example below. Call it like this: "String with a lot of text".TrimString(14); the output will be "String with..."

You should be able to add logic to avoid breaking within a word.

public static string TrimString(this string text, int length = 300)
{
    // Assumes length >= 3 so that there is room for the "..." suffix.
    if (text != null && text.Length > length)
    {
        // Keep the first (length - 3) characters and append the ellipsis.
        return text.Substring(0, length - 3) + "...";
    }

    return text;
}
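The word-boundary logic mentioned above could be added roughly as follows. This is only a sketch: the method name TrimStringAtWord and the class TrimSketch are hypothetical, and it assumes that words are separated by plain spaces and that the "..." suffix counts toward the length limit.

```csharp
using System;

public static class TrimSketch
{
    // Hypothetical variant of TrimString: the result never exceeds `length`
    // characters (suffix included) and cuts at a space when one is available.
    public static string TrimStringAtWord(this string text, int length = 300)
    {
        const string suffix = "...";
        if (text == null || text.Length <= length)
            return text;
        if (length <= suffix.Length)
            return text.Substring(0, Math.Max(length, 0)); // no room for the suffix

        int keep = length - suffix.Length;            // characters left for text
        // Search backwards from the cut position; a space exactly at the cut
        // means the kept text already ends on a whole word.
        int lastSpace = text.LastIndexOf(' ', keep);
        if (lastSpace > 0)
            keep = lastSpace;                         // back up to the word boundary

        return text.Substring(0, keep).TrimEnd() + suffix;
    }
}
```

With this sketch, "String with a lot of text".TrimStringAtWord(14) gives "String with...", while a limit of 12 backs up to the previous space and gives "String...".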

solution1
3 2013-11-18 01:00:08

solution2
0 2013-11-17 23:23:02
