
How to get content type of a web address?

I want to get the content type of a web address. For example, this is an HTML page and its content type is text/html, but the type of this one is text/xml. This page's type seems to be image/png, but it is actually text/html.

How can I detect the content type of a web address like this?

It should be something like this:

    // Requires: using System.Net;
    var request = WebRequest.Create("http://www.google.com") as HttpWebRequest;
    string contentType = "";

    if (request != null)
    {
        // Dispose the response so the underlying connection is released.
        using (var response = request.GetResponse() as HttpWebResponse)
        {
            if (response != null)
                contentType = response.ContentType;
        }
    }

HTTP Response header: content-type

For a more detailed response, please provide a more detailed question.

You can detect the content type from the HTTP headers of the response. For http://bayanbox.ir/user/ahmadalli/images/div.png, the headers are:

Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html; charset=utf-8
Date: Tue, 14 Aug 2012 03:01:41 GMT
Server: bws
Transfer-Encoding: chunked
Vary: Accept-Encoding
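To see a header listing like the one above from code, here is a minimal sketch. It assumes a HEAD request is acceptable to the server; the class and method names (`HeaderDump`, `Format`, `Fetch`) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Net;
using System.Text;

public static class HeaderDump
{
    // Formats a header collection as "Name: value" lines, one per header.
    public static string Format(WebHeaderCollection headers)
    {
        var sb = new StringBuilder();
        foreach (string name in headers.AllKeys)
            sb.AppendLine(name + ": " + headers[name]);
        return sb.ToString();
    }

    // Sends a HEAD request (headers only, no body) and returns the
    // formatted response headers. Assumption: the server supports HEAD.
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return Format(response.Headers);
        }
    }
}
```

Calling `Console.Write(HeaderDump.Fetch(url))` with the address in question should print the response headers, including Content-Type if the server sends one.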

Read up on HTTP headers.

HTTP headers will tell you the content type. For example:

content-type: application/xml

There are two ways to determine the content type:

  1. the file extension in the URL
  2. the HTTP Content-Type response header

The first was somewhat promoted by Microsoft in the old days and is no longer good practice.

If the client can only handle certain content types, it sends the request with headers like

accept: application/json
accept: text/html
accept: application/xml

If the server can supply one of those and chooses XML, it returns the content with the header

content-type: application/xml
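As a rough sketch of that exchange from the client side: `HttpWebRequest` exposes an `Accept` property for the request header, and the response's `ContentType` shows what the server picked. The class and method names here (`NegotiationSketch`, `BuildAcceptHeader`, `RequestWithAccept`) are hypothetical, and whether a given server actually honours Accept is up to that server.

```csharp
using System;
using System.Net;

public static class NegotiationSketch
{
    // Joins the media types the client can handle into a single
    // comma-separated Accept header value.
    public static string BuildAcceptHeader(params string[] mediaTypes)
    {
        return string.Join(", ", mediaTypes);
    }

    // Sends a GET request advertising the acceptable types and returns
    // the raw Content-Type header value of the response.
    public static string RequestWithAccept(string url, params string[] mediaTypes)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Accept = BuildAcceptHeader(mediaTypes);
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return response.ContentType;
        }
    }
}
```

For example, `NegotiationSketch.RequestWithAccept(url, "application/json", "text/html", "application/xml")` sends all three types in one Accept header, which is equivalent to sending them as separate header lines.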

Some services include further information in the same header, such as

content-type: application/xml; charset=utf-8

rather than using a separate header for the character encoding.
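Because the header value can carry parameters like charset, comparing it directly against "application/xml" can fail. One way to split it, sketched here with the `System.Net.Mime.ContentType` class from the framework (the wrapper name `ContentTypeParser` is made up for this example):

```csharp
using System;
using System.Net.Mime;

public static class ContentTypeParser
{
    // Splits a raw Content-Type header value into its media type and
    // optional charset parameter. Returns false if the value is missing
    // or cannot be parsed.
    public static bool TryParse(string headerValue, out string mediaType, out string charset)
    {
        mediaType = null;
        charset = null;
        if (string.IsNullOrWhiteSpace(headerValue))
            return false;
        try
        {
            var parsed = new ContentType(headerValue);
            mediaType = parsed.MediaType; // the type/subtype part only
            charset = parsed.CharSet;     // null when no charset parameter is present
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

You would pass it the value of `response.ContentType` and then compare only the media type part.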

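    // MyClient is not defined in this answer; it is assumed to be a WebClient
    // subclass whose HeadOnly flag makes requests use HEAD instead of GET.
    using (MyClient client = new MyClient())
    {
        client.HeadOnly = true;
        string uri = "http://www.google.com";
        byte[] body = client.DownloadData(uri); // note: should be 0-length
        string type = client.ResponseHeaders["content-type"];
        client.HeadOnly = false;
        // Check it's not binary: we test for text/ here, but could
        // check for text/html specifically.
        if (type != null && type.StartsWith("text/", StringComparison.OrdinalIgnoreCase))
        {
            string text = client.DownloadString(uri);
            Console.WriteLine(text);
        }
    }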

This will get you the MIME type from the headers without downloading the page. Just look for content-type in the response headers.
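The snippet above relies on a `MyClient` class that is not shown in this thread. The following is a guess at what it might look like, assuming it is a `WebClient` subclass that switches the request method to HEAD when `HeadOnly` is set. Treat it as a sketch, not the original author's code.

```csharp
using System;
using System.Net;

// Hypothetical reconstruction: the answer uses MyClient without defining it.
public class MyClient : WebClient
{
    // When true, GET requests are sent as HEAD so only headers come back.
    public bool HeadOnly { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (HeadOnly && request.Method == "GET")
        {
            request.Method = "HEAD";
        }
        return request;
    }
}
```

With `HeadOnly` set, `DownloadData` still runs the request, but the server returns no body, so only the response headers are transferred.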
