Extract URL link from src attribute of script tag in html through XSLT

Is there a way to extract the URL specified in the src attribute of a script tag in an HTML file using XSLT?

The HTML file looks like this:

<HTML>
<BODY>
<SCRIPT language="javascript" src="http://myspace.com" type="text/javascript"></script>
</BODY>
</HTML>

How do I code this in XSLT? I want to extract the URL into a variable, which I will then pass to another function.

Many thanks.

Use xsl:variable to store an attribute value. Later, refer to it as $name-of-variable.

I have slightly adapted your input HTML by lowercasing all element names. Uppercase element names are uncommon in HTML. Besides, your script element opens as SCRIPT but closes with a lowercase name, which makes the document malformed as XML: XML names are case-sensitive, and XSLT needs well-formed XML input.
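For reference, the adapted input would look roughly like this. This is a reconstruction that assumes only the element names were lowercased and the attributes were left untouched:

```xml
<html>
<body>
<!-- Same attributes as in the question; only the element names are lowercased -->
<script language="javascript" src="http://myspace.com" type="text/javascript"></script>
</body>
</html>
```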

Do not overcomplicate things. Depending on what you actually want to achieve it might not be necessary to store the attribute value in a variable.
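As a minimal sketch of the variable-free approach, assuming all you need is to write the URL to the output, the XPath expression can go directly into xsl:value-of:

```xml
<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/">
      <link>
         <!-- Select the src attribute directly; no variable involved -->
         <xsl:value-of select="//script/@src"/>
      </link>
   </xsl:template>

</xsl:stylesheet>
```

If the value is needed in more than one place, a variable is the tidier option.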

Stylesheet

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:variable name="link" select="//script/@src"/>

   <xsl:template match="/">
      <link>
         <xsl:value-of select="$link"/>
      </link>
   </xsl:template>

</xsl:stylesheet>

Output

<?xml version="1.0" encoding="UTF-8"?>
<link>http://myspace.com</link>
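Since the question mentions passing the URL on to "another function", here is a hedged sketch of how the variable could be handed to a named template as a parameter. The template name emit-link and the parameter name url are hypothetical, chosen only for illustration:

```xml
<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <!-- Global variable holding the src attribute of the script element -->
   <xsl:variable name="link" select="//script/@src"/>

   <xsl:template match="/">
      <!-- Pass the stored value to the named template below -->
      <xsl:call-template name="emit-link">
         <xsl:with-param name="url" select="$link"/>
      </xsl:call-template>
   </xsl:template>

   <!-- Hypothetical named template that receives the URL as a parameter -->
   <xsl:template name="emit-link">
      <xsl:param name="url"/>
      <link href="{$url}"/>
   </xsl:template>

</xsl:stylesheet>
```

With the adapted input, this should produce a single link element whose href attribute holds http://myspace.com.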