
Get a file url without webbrowser - C#

I'm trying to get the URL of an image. At the moment I have this code, which works but needs a WebBrowser control to do so.

    public void getFileUrl(HtmlDocument htmlDocument)
    {
        HtmlElementCollection htmlCollectionImage = htmlDocument.Images;
        foreach (HtmlElement htmlImage in htmlCollectionImage)
        {
            string Url = htmlImage.GetAttribute("src");
            if (Url.StartsWith("http://www.exemple.com/"))
            {
                MessageBox.Show(Url);
            }
        }
    }

I need to piece something together that doesn't require the WebBrowser, but I really don't know how to do that.

Also, instead of feeding the method an HtmlDocument htmlDocument, I need to feed it a simple string.

Any alternative?

Try something like this, using the HTML Agility Pack:

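using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Both methods belong inside a class, e.g. Program.
static void Main()
{
    var fileUrls = GetFileUrls(@"https://stackoverflow.com/questions/34054662/get-a-file-url-without-webbrowser-c-sharp", @"https://www.gravatar.com/");

    foreach (string url in fileUrls)
    {
        Console.WriteLine(url);
    }

    Console.ReadKey();
}

// Downloads the page at url and returns the src of every <img> that starts with pattern.
public static IEnumerable<string> GetFileUrls(string url, string pattern)
{
    var document = new HtmlWeb().Load(url);
    var urls = document.DocumentNode.Descendants("img")
                                    .Select(e => e.GetAttributeValue("src", null))
                                    // Skip <img> tags without a src, then compare case-insensitively.
                                    .Where(s => s != null && s.StartsWith(pattern, StringComparison.OrdinalIgnoreCase));

    return urls;
}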

Adapted from: How can I use HTML Agility Pack to retrieve all the images from a website?

Edited to include usage and add a pattern parameter to GetFileUrls().
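
If the "simple string" is the HTML markup itself rather than a URL, HTML Agility Pack can also parse it directly with LoadHtml instead of downloading the page. A minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class name ImageUrlExtractor and the method name GetImageUrlsFromHtml are made up for illustration and are not part of the library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is referenced

public static class ImageUrlExtractor // hypothetical helper, not a library type
{
    // Parses raw HTML markup and returns the src of every <img>
    // whose value starts with the given prefix (case-insensitive).
    public static IEnumerable<string> GetImageUrlsFromHtml(string html, string prefix)
    {
        // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        var document = new HtmlAgilityPack.HtmlDocument();
        document.LoadHtml(html);

        return document.DocumentNode.Descendants("img")
            // Use an empty default so <img> tags without src never yield null.
            .Select(img => img.GetAttributeValue("src", string.Empty))
            .Where(src => src.Length > 0 &&
                          src.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }
}
```

Calling ImageUrlExtractor.GetImageUrlsFromHtml(html, "http://www.exemple.com/") should then give the same matches as the loop in the question, without a WebBrowser control.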

