
Why do I get "An exception occurred during a WebClient request" when downloading images from a website?

The code:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using HtmlAgilityPack;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Net;
using System.Web;
using System.Threading;
using DannyGeneral;
using GatherLinks;

namespace GatherLinks
{
    class RetrieveWebContent
    {
        HtmlAgilityPack.HtmlDocument doc;
        string imgg;
        int images;

        public RetrieveWebContent()
        {
            images = 0;
        }

        public List<string> retrieveImages(string address)
        {
            try
            {
                doc = new HtmlAgilityPack.HtmlDocument();
                System.Net.WebClient wc = new System.Net.WebClient();
                List<string> imgList = new List<string>();
                doc.Load(wc.OpenRead(address));
                HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
                if (imgs == null) return new List<string>();

                foreach (HtmlNode img in imgs)
                {
                    if (img.Attributes["src"] == null)
                        continue;
                    HtmlAttribute src = img.Attributes["src"];

                    imgList.Add(src.Value);
                    if (src.Value.StartsWith("http") || src.Value.StartsWith("https") || src.Value.StartsWith("www"))
                    {
                        images++;
                        string[] arr = src.Value.Split('/');
                        imgg = arr[arr.Length - 1];
                        wc.DownloadFile(src.Value, @"d:\MyImages\" + imgg);
                    }
                }

                return imgList;
            }
            catch
            {
                Logger.Write("There Was Problem Downloading The Image: " + imgg);
                return null;

            }
        }
    }
}

An example link that gives this exception:

http://vanessawest.tripod.com/bundybowman.jpg

Execution enters the foreach loop, and after a few iterations it jumps to the catch block.

If the link is from another site, for example:

www.walla.co.il

then there is no problem: execution enters the foreach loop and gets all the images.

This is the full exception message for the link:

http://vanessawest.tripod.com/bundybowman.jpg

System.Net.WebException was caught
  HResult=-2146233079
  Message=An exception occurred during a WebClient request.
  Source=System
  StackTrace:
       at System.Net.WebClient.DownloadFile(Uri address, String fileName)
       at System.Net.WebClient.DownloadFile(String address, String fileName)
       at GatherLinks.RetrieveWebContent.retrieveImages(String address) in d:\C-Sharp\GatherLinks\GatherLinks-2\GatherLinks\GatherLinks\RetrieveWebContent.cs:line 55
  InnerException: System.ArgumentException
       HResult=-2147024809
       Message=Illegal characters in path.
       Source=mscorlib
       StackTrace:
            at System.IO.Path.CheckInvalidPathChars(String path, Boolean checkAdditional)
            at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
            at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
            at System.Net.WebClient.DownloadFile(Uri address, String fileName)
       InnerException:

I don't know why it works without problems for the walla link but throws an exception for the tripod one.
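In the trace above, the outer WebException only says that the request failed; the InnerException (an ArgumentException, "Illegal characters in path.") names the actual cause. The bare catch in retrieveImages discards both. As a minimal sketch, a helper like the following could flatten the exception chain into one line for the log. The class name DownloadDiagnostics and the method Describe are hypothetical and not part of the original code.

```csharp
using System;
using System.Collections.Generic;

static class DownloadDiagnostics
{
    // Hypothetical helper: walks the InnerException chain and joins
    // each exception's type name and message into a single string.
    public static string Describe(Exception ex)
    {
        var parts = new List<string>();
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            parts.Add(e.GetType().Name + ": " + e.Message);
        }
        return string.Join(" -> ", parts);
    }
}
```

Assuming the catch block is changed to `catch (Exception ex)`, the existing log call could then become something like `Logger.Write("There Was Problem Downloading The Image: " + imgg + " (" + DownloadDiagnostics.Describe(ex) + ")");`, which would show the file name that was rejected alongside the reason.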

http://vanessawest.tripod.com/bundybowman.jpg

Are you sure that's the one causing the problem? My guess is that you are getting some unwanted characters in the local file name because of the string split logic you use to get the file name.

string[] arr = src.Value.Split('/');
imgg = arr[arr.Length - 1];

Instead, try this:

imgg = Path.GetFileName(new Uri(src.Value).LocalPath);
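The suggestion above can be taken one step further. The following is only a sketch under a few assumptions: the helper name ImageFileNames.FromUrl is hypothetical, it takes the last path segment with plain string operations instead of Path.GetFileName (a deliberate swap, so that the lookup itself cannot reject odd characters), and it replaces any character reported by Path.GetInvalidFileNameChars with an underscore. It returns null for values that are not absolute URLs, such as a src that starts with "www" and has no scheme.

```csharp
using System;
using System.IO;
using System.Linq;

static class ImageFileNames
{
    // Hypothetical helper: derives a local file name from an image URL.
    // Returns null when the value is not an absolute URL or has no file part.
    public static string FromUrl(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return null;
        }

        // LocalPath excludes the query string and fragment.
        string path = uri.LocalPath;
        string name = path.Substring(path.LastIndexOf('/') + 1);

        // Replace characters the current platform does not allow in file names.
        char[] invalid = Path.GetInvalidFileNameChars();
        name = new string(name.Select(c => invalid.Contains(c) ? '_' : c).ToArray());

        return name.Length == 0 ? null : name;
    }
}
```

In the loop from the question, this might be used as `imgg = ImageFileNames.FromUrl(src.Value);` followed by a null check before calling `wc.DownloadFile(src.Value, Path.Combine(@"d:\MyImages", imgg));`.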
