
Writing HTML Source from External URL to String

I'm using WebClient's DownloadString method to store the HTML source of a webpage in a string in a C# web application (ASPX). The problem is that the string seems to end when it reaches a part of the HTML source that contains a URL.

I tried writing the string to a text file and this is how it ends:

<body class="page">
    <div id="container">
      <div id="header">
      <a href="http://

The original page source has about 50 more lines after this that my application doesn't include. It doesn't even finish the line it's on, which leads me to think the slashes might be some sort of string-terminating sequence in C#?

To troubleshoot, I tried WebClient's DownloadFile and saved the HTML source from the same address directly to a text file. This worked and the data was not truncated. When I tried reading that text file into a string, though, the same thing happened.

Any ideas? I've spent hours searching online and stuffing around and I can't figure this out! I've also tried alternative methods for reading data from a URL into a string, but the same issue occurs.

Thanks in advance.

Use Fiddler to intercept the HTTP request and see what the server sends back to you. If Fiddler shows the same response content as DownloadString, then your problem is on the server. Otherwise it's your client.

Perhaps you could use DownloadData instead of DownloadString?
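
As a rough sketch of that suggestion: DownloadData returns the raw bytes, so you can check how many actually arrived before decoding them yourself. The helper names `DecodeBody` and `FetchPage`, the example.com sample, and the choice of UTF-8 are assumptions for illustration only; the real charset should come from the response's Content-Type header.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper: decodes raw response bytes with an explicit encoding.
static string DecodeBody(byte[] data, Encoding encoding)
{
    return encoding.GetString(data);
}

// Hypothetical helper: downloads the raw bytes, reports how many arrived,
// and decodes them. UTF-8 is an assumption, not something the page guarantees.
static string FetchPage(string url)
{
    using (var client = new WebClient())
    {
        byte[] data = client.DownloadData(url);
        Console.WriteLine("Downloaded {0} bytes", data.Length);
        return DecodeBody(data, Encoding.UTF8);
    }
}

// Usage that does not touch the network: decode a sample containing a URL.
byte[] sample = Encoding.UTF8.GetBytes("<a href=\"http://example.com/\">home</a>");
string html = DecodeBody(sample, Encoding.UTF8);
Console.WriteLine(html.Length == sample.Length);
```

If the byte count matches what Fiddler reports but the decoded string is shorter, the problem is probably in decoding or in how the string is being displayed rather than in the download itself.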

I finally figured it out and thought I'd post my solution for future reference for others.

After messing around with it further, I found a workaround using the following code (courtesy of this post: Unable to Fetch a Webpage):

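    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    // url is assumed to be a string holding the address of the page to fetch.
    StringBuilder sb  = new StringBuilder();
    byte[]        buf = new byte[8192];
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    // Dispose the response and its stream once reading is finished.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream resStream = response.GetResponseStream())
    {
        int count;
        do
        {
            // Read returns 0 once the end of the stream is reached.
            count = resStream.Read(buf, 0, buf.Length);
            if (count != 0)
            {
                // Note: ASCII decoding turns any non-ASCII byte into '?'.
                sb.Append(Encoding.ASCII.GetString(buf, 0, count));
            }
        }
        while (count > 0);
    }
    Console.WriteLine(sb.ToString());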

I'm still not entirely clear on why this workaround was necessary, but I'm just happy I found a solution!
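
For what it's worth, a shorter variant of the same idea is to wrap the response stream in a StreamReader, which also avoids the ASCII-only decoding in the loop above. This is only a sketch: `ReadAll` is a made-up helper name, UTF-8 is an assumed encoding, and the MemoryStream stands in for response.GetResponseStream() so it runs without a network call.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: reads an entire stream into a string using the given encoding.
static string ReadAll(Stream stream, Encoding encoding)
{
    using (var reader = new StreamReader(stream, encoding))
    {
        // ReadToEnd keeps reading until the stream reports end of data.
        return reader.ReadToEnd();
    }
}

// Stand-in for response.GetResponseStream(); the sample includes a URL and a non-ASCII character.
byte[] sample = Encoding.UTF8.GetBytes("<a href=\"http://example.com/\">caf\u00e9</a>");
string html = ReadAll(new MemoryStream(sample), Encoding.UTF8);
Console.WriteLine(html);
```

With a real request you would pass response.GetResponseStream() instead of the MemoryStream. I haven't confirmed that this behaves any differently from DownloadString for the page that caused the problem.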
