Change HTML with inline CSS to use data URLs

I want to change, on the fly, an HTML document similar to this:

<html><head><style>
body { 
    background: transparent url(http://example.com/image.gif) no-repeat right bottom;
}
</style></head>
<body>
    <img src="http://example.com/image2.gif"/>
</body>
</html>

into this (the data URLs are truncated):

<html><head><style>
body { 
    background: transparent url(...) no-repeat right bottom;
}
</style>
</head>
<body>
    <img src="...."/>
</body>
</html>

At the moment I use this code:

private string EmbebedImages(string strHtml)
{
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(strHtml);

    foreach (var imgNode in doc.DocumentNode.SelectNodes("//img[@src]"))
    {
        string url = imgNode.Attributes["src"].Value;
        if (url.StartsWith("http"))
        {
            using (var webClient = new WebClient())
            {
                var imageAsByteArray = webClient.DownloadData(url);
                string mimeType = MimeMapping.GetMimeMapping(url);

                imgNode.Attributes["src"].Value = "data:" + mimeType + ";base64," +
                    Convert.ToBase64String(imageAsByteArray);
            }
        }
    }

    return doc.DocumentNode.OuterHtml;
}

But my code ignores the URLs in the CSS.

Is there a simple way to handle those as well? I tried a few CSS libraries, but I couldn't find a simple approach.

You can't do that with HtmlAgilityPack, but you can try a regex:

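using System;                          // Convert
using System.Net;                      // WebClient
using System.Text.RegularExpressions;
using System.Web;                      // MimeMapping

private string EmbebedImages(string strHtml) {
    // Assumes the HTML to process is the strHtml argument.
    string htmlString = strHtml;

    // Match absolute image URLs anywhere in the markup, including inside CSS url(...).
    // The character class keeps a match from running past quotes, parentheses or whitespace.
    var images_url = Regex.Matches(htmlString, @"https?://[^\s""'()<>]+?\.(?:gif|png|jpg|jpeg)");

    using (var webClient = new WebClient()) {
        foreach (Match match in images_url) {
            string currentURL = match.Value;
            byte[] imageAsByteArray = webClient.DownloadData(currentURL);
            string mimeType = MimeMapping.GetMimeMapping(currentURL);
            string dataURL = "data:" + mimeType + ";base64," + Convert.ToBase64String(imageAsByteArray);
            // Replace swaps every occurrence of this URL in the document.
            htmlString = htmlString.Replace(currentURL, dataURL);
        }
    }

    return htmlString;
}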
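
For reference, the string both snippets build has the form data:<mime type>;base64,<payload>. Here is a minimal sketch of that step in Python, kept separate from the download so it can be tried without network access. The helper name to_data_url and the application/octet-stream fallback are my own assumptions, not part of the code above:

```python
import base64
import mimetypes


def to_data_url(url, content):
    """Build a data URL from already-downloaded bytes.

    The MIME type is guessed from the URL's file extension (an assumption;
    a server's Content-Type header would be more reliable when available).
    """
    mime_type, _ = mimetypes.guess_type(url)
    if mime_type is None:
        # Hypothetical fallback for URLs without a recognisable extension.
        mime_type = "application/octet-stream"
    payload = base64.b64encode(content).decode("ascii")
    return "data:{};base64,{}".format(mime_type, payload)
```

Guessing from the extension mirrors what MimeMapping.GetMimeMapping does in the C# version, so it has the same weakness: a URL with a misleading or missing extension gets the wrong type.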

This doesn't apply to the OP's case, but if you want to convert URLs to data URIs in CSS with Python, here is how I did it:

import requests
import re
import base64

def url_to_base64(url, memo=None):
    if memo is None:
        memo = {}
    if url in memo:
        return memo[url]
    res = requests.get(url)
    mime_type = res.headers['content-type']
    base64_data = base64.b64encode(res.content).decode('utf-8')
    data_url = "data:{};base64,{}".format(mime_type, base64_data)
    memo[url] = data_url
    return data_url


def embed_urls(html, memo=None):
    if memo is None:
        memo = {}
    pattern = re.compile(r"url\('(https?://.*?)'\)")
    urls = set(re.findall(pattern, html))
    for url in urls:
        base64_data = url_to_base64(url, memo)
        html = html.replace(url, base64_data)
    return html
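
Note that the pattern above only matches single-quoted url('...') references, while the CSS in the question uses the unquoted form url(http://...). A possible variant that accepts unquoted, single-quoted and double-quoted references is sketched below. The names embed_css_urls and fetch are hypothetical; fetch is assumed to be a callable you supply that returns a (mime_type, content_bytes) tuple for a URL, which keeps the sketch independent of any particular HTTP library:

```python
import base64
import re

# url(...) with optional matching single or double quotes around an http(s) URL.
URL_PATTERN = re.compile(r"""url\(\s*(['"]?)(https?://[^'")\s]+)\1\s*\)""")


def embed_css_urls(html, fetch):
    """Replace every http(s) url(...) reference in html with a data URL.

    fetch is a caller-supplied (hypothetical) callable: url -> (mime_type, bytes).
    """
    cache = {}  # each distinct URL is fetched only once

    def replace(match):
        url = match.group(2)
        if url not in cache:
            mime_type, content = fetch(url)
            payload = base64.b64encode(content).decode("ascii")
            cache[url] = "data:{};base64,{}".format(mime_type, payload)
        # Emit the unquoted form; base64 data URLs contain no characters that need quoting.
        return "url({})".format(cache[url])

    return URL_PATTERN.sub(replace, html)
```

With requests, fetch could wrap requests.get and return the content-type header together with res.content, as url_to_base64 does above. Replacing through re.sub also means only the url(...) references are rewritten, not every other occurrence of the same address in the page.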

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.