HTML to XSLT conversion using C#

I am trying to convert an HTML page to an XSLT page using C#. For example, if I have something like

<a href="#companynameURL#">#companyname#</a>

I have to convert it into

<a href="{test/companynameURL}"><xsl:value-of select="test/companyname" /></a>

I have an XSL file which has all these values. I don't want to substitute the values at this stage, because they have to be processed further before they replace the originals. The problem I am facing is that I have trouble identifying whether a placeholder sits at the attribute level of a tag or at the value (text) level of a tag, which determines the XSLT construct that has to replace it.

I am trying to use regular expressions for this. Can someone help?

HTML Agility Pack is the way to go. Don't forget to add a reference to it. This code illustrates one way of using HTML Agility Pack to create an XSLT, which is what I think you want to do.

    using System;
    using System.Text;
    using System.Xml;
    using HtmlAgilityPack;

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(@"<html>" + 
        "<a href='#companynameURL1#'>#companyname1#</a>" +
        "<a href='#companynameURL2#'>#companyname2#</a>" +
        "</html>");

    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.IndentChars = ("    ");
    settings.Encoding = Encoding.UTF8;

    using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
    {                                
        writer.WriteStartDocument();
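        writer.WriteStartElement("xsl", "stylesheet", "http://www.w3.org/1999/XSL/Transform");
        // XSLT requires a version attribute on xsl:stylesheet.
        writer.WriteAttributeString("version", "1.0");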
        writer.WriteStartElement("template", "http://www.w3.org/1999/XSL/Transform");
        writer.WriteAttributeString("match", "/");
        writer.WriteElementString("apply-templates", "http://www.w3.org/1999/XSL/Transform", "");
        writer.WriteEndElement();
        writer.WriteStartElement("template", "http://www.w3.org/1999/XSL/Transform");
        // "test/" is not a valid match pattern; match the test element itself.
        writer.WriteAttributeString("match", "test");
        foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a"))
        {
            HtmlAttribute att = link.Attributes["href"];
            writer.WriteStartElement("a");
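                writer.WriteStartElement("attribute", "http://www.w3.org/1999/XSL/Transform");
                    // xsl:attribute requires a name; this one recreates the href attribute.
                    writer.WriteAttributeString("name", "href");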
                    writer.WriteStartElement("value-of", "http://www.w3.org/1999/XSL/Transform");
                        // Strip the surrounding # delimiters so the select is a plain element name.
                        writer.WriteAttributeString("select", att.Value.Trim('#'));
                    writer.WriteEndElement();
                writer.WriteEndElement();
                writer.WriteStartElement("value-of", "http://www.w3.org/1999/XSL/Transform");
                    // Strip the surrounding # delimiters here as well.
                    writer.WriteAttributeString("select", link.InnerText.Trim('#'));
                writer.WriteEndElement();
            writer.WriteEndElement();
        }
        writer.WriteEndElement();
        writer.WriteEndDocument();

    }
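
The loop above only handles the href and the text of `<a>` elements. To cover the general case from the question (a placeholder may appear in any attribute or in any text node), one possible approach is to walk a parsed tree and choose the XSLT construct by node kind rather than by regex context. The following is a minimal sketch, not a definitive implementation. It assumes the markup is well-formed XML so that `System.Xml.Linq` can parse it (real-world HTML would first have to be loaded and cleaned up with a tolerant parser such as HTML Agility Pack), and the class `PlaceholderConverter`, the method `Rewrite`, and the `#name#` pattern `#(\w+)#` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class PlaceholderConverter
{
    static readonly XNamespace Xsl = "http://www.w3.org/1999/XSL/Transform";

    // Assumption: a placeholder is #name#, where name consists of word characters.
    static readonly Regex Placeholder = new Regex(@"#(\w+)#");

    // Hypothetical helper: rewrites placeholders in place, prefixing each name with root.
    public static void Rewrite(XElement element, string root)
    {
        // Attribute level: emit an attribute value template such as {test/name}.
        foreach (XAttribute attr in element.DescendantsAndSelf().Attributes().ToList())
        {
            // Literal braces must be doubled inside an attribute value template.
            string escaped = attr.Value.Replace("{", "{{").Replace("}", "}}");
            attr.Value = Placeholder.Replace(escaped, "{" + root + "/$1}");
        }

        // Text level: split each text node around its placeholders and
        // replace every placeholder with an xsl:value-of element.
        foreach (XText text in element.DescendantNodesAndSelf().OfType<XText>().ToList())
        {
            var parts = new List<object>();
            int pos = 0;
            foreach (Match m in Placeholder.Matches(text.Value))
            {
                if (m.Index > pos)
                    parts.Add(new XText(text.Value.Substring(pos, m.Index - pos)));
                parts.Add(new XElement(Xsl + "value-of",
                    new XAttribute("select", root + "/" + m.Groups[1].Value)));
                pos = m.Index + m.Length;
            }
            if (pos == 0)
                continue; // no placeholder in this text node
            if (pos < text.Value.Length)
                parts.Add(new XText(text.Value.Substring(pos)));
            text.ReplaceWith(parts.ToArray());
        }
    }

    static void Main()
    {
        XElement a = XElement.Parse("<a href=\"#companynameURL#\">#companyname#</a>");
        Rewrite(a, "test");

        // Declaring xmlns:xsl on the stylesheet lets the serializer use the xsl: prefix.
        var stylesheet = new XElement(Xsl + "stylesheet",
            new XAttribute(XNamespace.Xmlns + "xsl", Xsl),
            new XAttribute("version", "1.0"),
            new XElement(Xsl + "template", new XAttribute("match", "/"), a));
        Console.WriteLine(stylesheet);
    }
}
```

Because the decision is made on the node type (attribute versus text node), the regular expression only has to find the placeholder name and never has to work out where in the tag it sits.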

I'm not aware of a component that will get you all the way to XSLT, but the HTML Agility Pack is wonderful for any sort of HTML manipulation. The parser provides a complete object tree with attributes, tags, styles, etc. clearly defined, and it's easily queryable with XPath.
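
As a rough illustration of how that object tree separates the two cases from the question, the sketch below lists every placeholder and reports whether it was found in an attribute or in a text node. It assumes the HtmlAgilityPack NuGet package is referenced and C# 9 or later for top-level statements; the `#name#` pattern and the sample markup are assumptions taken from the question, not part of the library.

```csharp
using System;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.LoadHtml("<a href='#companynameURL#'>#companyname#</a>");

// Assumption: a placeholder is #name#, where name consists of word characters.
var placeholder = new Regex(@"#(\w+)#");

foreach (HtmlNode node in doc.DocumentNode.DescendantsAndSelf())
{
    if (node.NodeType == HtmlNodeType.Text)
    {
        // Value level: the placeholder is part of a text node.
        foreach (Match m in placeholder.Matches(node.InnerText))
            Console.WriteLine($"text of <{node.ParentNode.Name}>: {m.Groups[1].Value}");
        continue;
    }

    // Attribute level: the placeholder is part of an attribute value.
    foreach (HtmlAttribute attr in node.Attributes)
        foreach (Match m in placeholder.Matches(attr.Value))
            Console.WriteLine($"attribute {attr.Name} of <{node.Name}>: {m.Groups[1].Value}");
}
```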

Also, for a good discussion of parsing HTML with regex, see the first answer on this post.
