Is there a list of possible resources that a web browser will download when visiting a webpage?

I have been building a spider for a few days, and I am researching how to measure a webpage's total weight in bytes. The simplest answer I came across was to take the content length of the page. But there is a problem with that: the content length of the document says nothing about the images the browser has to download to its temp folder, nor about the JavaScript or CSS files linked from the head of the page. So I concluded that the right thing to measure is how many bytes must be sent from the server to the client for all the resources the webpage needs to work properly, not only the bytes of the document itself. So I made a list of the resources that a web browser should download when it visits a page:

all images <img src="someimages.jpg" alt="somedescription">
all js files <script type="text/javascript" src="somejs.js"></script>
all css files <link rel="stylesheet" type="text/css" href="somecss.css">
the ico file <link rel="shortcut icon" href="someico.ico">

Are there any other resources that a browser has to download when it visits the page? In other words, what is the list of all the possible resources that a browser does download when visiting a webpage?
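The four kinds of tags listed above can be pulled out of a page with a small parser. The following is only a minimal sketch using Python's standard library; the names `ResourceCollector` and `collect_resources` are made up for illustration, and it covers just the tags from the list, not every resource a browser might fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of images, external scripts, stylesheets and icons."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative URLs
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            # Inline scripts have no src attribute and are skipped.
            url = attrs.get("src")
        elif tag == "link":
            # rel may hold several tokens, e.g. "shortcut icon".
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel or "icon" in rel:
                url = attrs.get("href")
        if url:
            self.resources.append(urljoin(self.base_url, url))


def collect_resources(html, base_url):
    """Return absolute URLs of the resources referenced by the HTML."""
    parser = ResourceCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.resources
```

Calling `collect_resources(page_html, page_url)` returns the URLs in document order. Resources requested from inside CSS or by scripts at runtime would not show up in this list.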

There is an endless number of possibilities when it comes to media types that can be downloaded. In fact, you can "invent" your own as long as you tell your server about them.

Here's a pretty good list to get you started. It's not a list of tags like <video>, <object>, <img>, <audio>, but rather a list of MIME types.

All of these media types have a payload when downloaded, and their size needs to be measured. Also, don't forget about streaming media and long polling. Measuring those payloads can be a bit of a bear.
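As a rough illustration of measuring payloads per media type, here is a minimal sketch in Python. The helper names `fetch` and `weight_by_mime_type` are hypothetical, and the sketch assumes each resource is an ordinary finite download, so it does not handle streaming media or long polling. It also counts the body as `urllib` receives it, which may differ from what a browser transfers when compression is negotiated.

```python
from collections import defaultdict
from urllib.request import urlopen


def fetch(url):
    """Download one URL and return (mime_type, body_size_in_bytes)."""
    with urlopen(url) as response:
        body = response.read()
        mime = response.headers.get_content_type()
        return mime, len(body)


def weight_by_mime_type(urls, fetch_fn=fetch):
    """Sum downloaded bytes per MIME type, fetching each URL only once."""
    totals = defaultdict(int)
    for url in dict.fromkeys(urls):  # de-duplicate while keeping order
        mime, size = fetch_fn(url)
        totals[mime] += size
    return dict(totals)
```

Summing the values of the returned dictionary gives an estimate of the total page weight for the URLs passed in. The `fetch_fn` parameter is there so a different downloader (or a stub) can be swapped in.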

Along with the three types you mentioned (the icon is an image file), one more kind of file can be downloaded when you load a page on an ASP.NET site: HttpHandler files (.axd files).

Any other files, such as PDF, ZIP, audio, video and other MIME types, will be loaded if the page requests them.
