
Remove indentation and formatting in an HTML string

I am using the code snippet below to get the HTML string of a control. The result contains a lot of formatting characters such as \n, \t, and \r that indent the HTML. How do I remove them without affecting the formatting of the actual text within the controls?

using System.IO;
using System.Web.UI;

public static string RenderControl( Control control )
{
    string renderedString;

    // Render the control into an in-memory writer and capture the markup.
    using ( TextWriter writer = new StringWriter( ) )
    {
        control.RenderControl( new HtmlTextWriter( writer ) );
        renderedString = writer.ToString( );
    }

    return renderedString;
}

For example, the rendered output of a table control looks something like this:

<table>\r\n\t\t<tr>\r\n\t\t         
<td>abc\r\n def</td>...</table>

The output I need is:

<table><tr>         
<td>abc\r\n def</td>...</table>

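If the generated markup is well-formed XML, you could parse the result with an XmlReader or an XmlDocument instance and rewrite it with an XmlWriter, configured so that insignificant whitespace is dropped: for example, leave XmlDocument.PreserveWhitespace at false (or set XmlReaderSettings.IgnoreWhitespace to true) when reading, and set XmlWriterSettings.Indent to false when writing.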
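A minimal sketch of the XML round-trip, assuming the rendered markup is well-formed XML with a single root element. The class and method names (MarkupCompactor, StripInsignificantWhitespace) are hypothetical and used only for illustration. Whitespace-only nodes between tags are dropped on load; text inside elements is kept, although the XML parser may normalize line endings within it.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
public static class MarkupCompactor
{
    public static string StripInsignificantWhitespace(string markup)
    {
        // With PreserveWhitespace = false (the default), whitespace-only
        // nodes between elements are discarded while loading.
        var document = new XmlDocument { PreserveWhitespace = false };
        document.LoadXml(markup); // throws XmlException if the markup is not well-formed

        var settings = new XmlWriterSettings
        {
            Indent = false,                         // do not add indentation back
            OmitXmlDeclaration = true,              // emit only the markup itself
            NewLineHandling = NewLineHandling.None  // leave newlines in text untouched
        };

        using (var buffer = new StringWriter())
        {
            using (var writer = XmlWriter.Create(buffer, settings))
            {
                document.Save(writer);
            }
            return buffer.ToString();
        }
    }
}
```

Typical HTML that is not valid XML (unclosed tags, unquoted attributes, entities such as &nbsp;) will make LoadXml throw, so this only fits controls that render XHTML-style output.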

An alternative (and potentially easier) strategy is described below:

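In HTML, a run of whitespace characters in ordinary text renders the same as a single space (elements such as pre are the exception), so a quick fix is to pass the generated markup through a regular expression replacement that removes runs of adjacent whitespace characters (i.e. replace "\s\s+" with ""; '\s' is the .NET Regex symbol for any whitespace character). Note that replacing with an empty string also deletes runs inside text content, so "abc\r\n def" would become "abcdef". Replacing with a single space, or matching only the whitespace between tags (">\s+<" replaced with "><"), leaves the text inside the controls intact.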
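The regular expression approach could be sketched as follows. The class and method names are hypothetical. The first method applies the "\s\s+" replacement literally; the second is a narrower variant (not from the original answer) that removes whitespace only where it sits between a closing ">" and the next "<", which is closer to the output asked for in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, shown only to illustrate the two patterns.
public static class MarkupRegexCleaner
{
    // Literal version: delete every run of two or more whitespace
    // characters, including runs inside text content.
    public static string RemoveWhitespaceRuns(string markup)
    {
        return Regex.Replace(markup, @"\s\s+", "");
    }

    // Narrower variant: delete whitespace only when it lies between
    // two tags, so text such as "abc\r\n def" is left as it is.
    public static string RemoveWhitespaceBetweenTags(string markup)
    {
        return Regex.Replace(markup, @">\s+<", "><");
    }
}
```

Neither pattern understands HTML structure, so whitespace between tags inside pre or textarea elements, or between inline elements where a space matters, would also be removed.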
