Why am I getting "An exception occurred during a WebClient request" when downloading images from a website?
The code:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using HtmlAgilityPack;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Net;
using System.Web;
using System.Threading;
using DannyGeneral;
using GatherLinks;
namespace GatherLinks
{
    class RetrieveWebContent
    {
        HtmlAgilityPack.HtmlDocument doc;
        string imgg;
        int images;

        public RetrieveWebContent()
        {
            images = 0;
        }

        public List<string> retrieveImages(string address)
        {
            try
            {
                doc = new HtmlAgilityPack.HtmlDocument();
                System.Net.WebClient wc = new System.Net.WebClient();
                List<string> imgList = new List<string>();
                doc.Load(wc.OpenRead(address));
                HtmlNodeCollection imgs = doc.DocumentNode.SelectNodes("//img[@src]");
                if (imgs == null) return new List<string>();
                foreach (HtmlNode img in imgs)
                {
                    if (img.Attributes["src"] == null)
                        continue;
                    HtmlAttribute src = img.Attributes["src"];
                    imgList.Add(src.Value);
                    if (src.Value.StartsWith("http") || src.Value.StartsWith("https") || src.Value.StartsWith("www"))
                    {
                        images++;
                        string[] arr = src.Value.Split('/');
                        imgg = arr[arr.Length - 1];
                        wc.DownloadFile(src.Value, @"d:\MyImages\" + imgg);
                    }
                }
                return imgList;
            }
            catch
            {
                Logger.Write("There Was Problem Downloading The Image: " + imgg);
                return null;
            }
        }
    }
}
An example link that triggers this exception:
http://vanessawest.tripod.com/bundybowman.jpg
Execution enters the foreach loop, and after a few iterations it jumps to the catch block.
If the link is from another site, for example:
www.walla.co.il
then there is no problem: it goes through the foreach loop and downloads all the images.
This is the full exception message for the link:
http://vanessawest.tripod.com/bundybowman.jpg
System.Net.WebException was caught
HResult=-2146233079
Message=An exception occurred during a WebClient request.
Source=System
StackTrace:
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
at System.Net.WebClient.DownloadFile(String address, String fileName)
at GatherLinks.RetrieveWebContent.retrieveImages(String address) in d:\C-Sharp\GatherLinks\GatherLinks-2\GatherLinks\GatherLinks\RetrieveWebContent.cs:line 55
InnerException: System.ArgumentException
HResult=-2147024809
Message=Illegal characters in path.
Source=mscorlib
StackTrace:
at System.IO.Path.CheckInvalidPathChars(String path, Boolean checkAdditional)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
InnerException:
I don't know why it works without problems for the walla link but throws an exception for the tripod one.
http://vanessawest.tripod.com/bundybowman.jpg
Are you sure that's the one causing the problem? My guess is that you are getting some unwanted characters in the local file name because of the string-split logic you use to extract the file name:
string[] arr = src.Value.Split('/');
imgg = arr[arr.Length - 1];
Instead, try this:
imgg = Path.GetFileName(new Uri(src.Value).LocalPath);
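To make the idea more concrete, here is a minimal sketch of a helper that builds the local file name from the URL's path only and then replaces any character the file system rejects. The class name ImageFileNames and the method GetSafeFileName are hypothetical and not part of the original code. The sketch assumes the src value is an absolute URL, and it takes the last path segment with plain string operations rather than Path.GetFileName, so that an odd character in the URL cannot make the path helper itself throw. The assumption that the offending characters come from something like a query string (for example ?size=large) is only a guess; the inner exception does not say which character was illegal.

```csharp
using System;
using System.IO;

static class ImageFileNames
{
    // Hypothetical helper: derives a local file name from an absolute image URL.
    public static string GetSafeFileName(string absoluteUrl)
    {
        // AbsolutePath excludes the query string and fragment,
        // which a plain Split('/') on the whole URL would keep.
        string path = new Uri(absoluteUrl).AbsolutePath;

        // Take the last path segment and decode escapes such as %20.
        string name = Uri.UnescapeDataString(path.Substring(path.LastIndexOf('/') + 1));

        // Replace anything the current file system does not allow in a file name.
        foreach (char c in Path.GetInvalidFileNameChars())
            name = name.Replace(c, '_');

        return name;
    }
}
```

Inside the loop it could be used as imgg = ImageFileNames.GetSafeFileName(src.Value); followed by wc.DownloadFile(src.Value, Path.Combine(@"d:\MyImages", imgg)). Note that the result is an empty string when the URL ends with a slash, so a real implementation would need a fallback name for that case.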