Handling resizing over 100,000 uploaded images per day

Currently we are resizing images in real time. Our users upload some 50K to 100K images per day, our server is pegged at 100% CPU usage around the clock, and page load times are slow. RAM is not an issue.

We have a dual-CPU Intel Xeon L5630 2.13 GHz server with 24 GB of RAM, and we are using a simple image-resize script to make thumbnails from the original images. But this is maxing out the CPUs.

I'd like to attack this problem both via hardware and software.

  • On the hardware side, we are going to order another server with dual Xeon 3 GHz processors. This server will handle the image processing separately from the website.

  • On the software side, I'd like to ask someone with experience what kind of software they use to process images with low CPU overhead.

Any thoughts would be appreciated.

First, analyze your read/write balance and (more importantly) read distribution

Most high-volume users discover that it's better to send uploads directly to blob storage, unmodified, and perform resizing dynamically on request. There are a lot of benefits to this approach (agility, mobile support, control), but primarily it can reduce resource utilization. Most content never gets eyeballs; why waste resources on it?
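
If you go that route, the shape of the code is simple. Below is a minimal sketch of an on-demand thumbnail handler in ASP.NET; the handler name, the ~/originals/ folder, and the querystring parameters are all hypothetical, and System.Drawing is used only to keep the sketch short (the next section's point about a server-safe library still applies):

    // Minimal sketch: store uploads untouched, resize only when a thumbnail
    // is actually requested, e.g. /thumb.ashx?name=photo123.jpg&width=200
    // (handler name and parameters are hypothetical).
    using System;
    using System.Drawing;
    using System.Drawing.Drawing2D;
    using System.Drawing.Imaging;
    using System.Web;

    public class ThumbHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            // Real code must validate "name" against path traversal.
            string name = context.Request["name"];
            int width = int.Parse(context.Request["width"] ?? "200");
            string path = context.Server.MapPath("~/originals/" + name);

            using (var src = Image.FromFile(path))
            {
                int height = src.Height * width / src.Width; // keep aspect ratio
                using (var thumb = new Bitmap(width, height))
                using (var g = Graphics.FromImage(thumb))
                {
                    g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                    g.DrawImage(src, 0, 0, width, height);
                    context.Response.ContentType = "image/jpeg";
                    thumb.Save(context.Response.OutputStream, ImageFormat.Jpeg);
                }
            }
        }

        public bool IsReusable { get { return true; } }
    }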

Second, get a server-safe, server-optimized imaging library

It appears you're already looking at my software, ImageResizer, which is the only server-friendly choice for .NET as far as I know. It also offers a range of pipelines, each with its own tradeoffs. If you want raw speed, use the WIC pipeline; I expect you'll see 10-20x better performance than your current script. The quality of the WIC pipeline isn't quite as good as the default pipeline, but it is 2-4x faster, and you won't be able to tell the difference on most images.
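
For concreteness, here is how I understand pipeline selection works in ImageResizer v3; treat the exact names as an assumption to verify against the docs rather than gospel:

    // Assumption from the ImageResizer v3 docs: register the WicBuilder plugin
    // in Web.config (<add name="WicBuilder"/> under <resizer><plugins>), then
    // builder=wic selects the WIC pipeline for an individual operation.
    using ImageResizer;

    public static class Thumbnails
    {
        public static void BuildWicThumb(string source, string dest)
        {
            // width value is illustrative, not a recommendation
            ImageBuilder.Current.Build(
                source, dest, new ResizeSettings("width=200&builder=wic"));
        }
    }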

Consider getting help

I've assisted many companies in scaling past the 1 million and 10 million image barriers; there are a few architectural pitfalls to avoid, but at this scale it's good to take a very in-depth look at your needs. Is Varnish or a CDN an option? These can drastically help in scaling past the 500-images-per-second barrier.
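
One prerequisite either way: Varnish or a CDN can only absorb repeat requests if the resized responses are marked cacheable. A sketch of the relevant ASP.NET cache headers (the 30-day TTL is illustrative):

    using System;
    using System.Web;

    public static class CacheHeaders
    {
        // Public + far-future expiry lets a fronting Varnish/CDN serve
        // repeat requests without ever touching the resizing server.
        public static void MarkCacheable(HttpResponse response)
        {
            response.Cache.SetCacheability(HttpCacheability.Public);
            response.Cache.SetExpires(DateTime.UtcNow.AddDays(30)); // illustrative TTL
            response.Cache.SetValidUntilExpires(true);
        }
    }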

Firstly, do they need to be processed in real time? If not, you could look at a queue-based system that uses other servers, or quiet time on the current server, to process thumbnails.
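
A minimal sketch of that idea with an in-process producer/consumer queue; the Resize helper is hypothetical, and a real deployment would more likely use a persistent queue (a database table, MSMQ, RabbitMQ) so jobs survive restarts:

    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public static class ThumbnailQueue
    {
        private static readonly BlockingCollection<string> Jobs =
            new BlockingCollection<string>();

        // Uploads enqueue a job and return immediately.
        public static void Enqueue(string originalPath)
        {
            Jobs.Add(originalPath);
        }

        // Worker threads drain the queue at their own pace; they could run
        // on another server, or be started only during quiet hours.
        public static void StartWorkers(int count)
        {
            for (int i = 0; i < count; i++)
                Task.Factory.StartNew(() =>
                {
                    foreach (var path in Jobs.GetConsumingEnumerable())
                        Resize(path); // hypothetical resize helper
                }, TaskCreationOptions.LongRunning);
        }

        private static void Resize(string path)
        {
            /* call the imaging library here */
        }
    }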

Are the images only resized once, or can a user request the image again, causing another resize? If so, then you need to look at caching the resized output.
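
The cheapest version of that cache is to check the destination file before resizing, so each (image, size) pair costs CPU only once; the paths and helpers below are hypothetical:

    using System.IO;

    public static class ThumbCache
    {
        // Pay the CPU cost once per (image, size) pair; serve from disk after.
        public static string GetOrCreateThumb(string originalPath, int width)
        {
            string thumbPath = Path.ChangeExtension(originalPath, null)
                               + "_w" + width + ".jpg"; // hypothetical naming scheme
            if (!File.Exists(thumbPath))
                Resize(originalPath, thumbPath, width); // hypothetical helper
            return thumbPath;
        }

        private static void Resize(string src, string dest, int width)
        {
            /* call the imaging library here */
        }
    }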

How large are the images being resized, and how many are being processed at once? Or is the load spread out over the whole day?

Are the page-load delays caused by resizing the image during the postback? If so, then moving that work onto a new thread would improve the response time.
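
A compact sketch of that, again assuming a hypothetical Resize helper; the queue sketch above is the more robust version of the same idea, since it bounds how many resizes run at once:

    using System.Threading;

    public static class FireAndForget
    {
        // Queue the resize and return at once, so the postback response
        // no longer waits on image work.
        public static void ResizeLater(string src, string dest, int width)
        {
            ThreadPool.QueueUserWorkItem(_ => Resize(src, dest, width));
        }

        private static void Resize(string src, string dest, int width)
        {
            /* call the imaging library here */
        }
    }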

First, separating image processing from the main server is a good idea.

The best answer depends on more details from your particular situation: the software environment, the software used for resizing images, how it is configured, and so on. Optimizing there might yield a powerful speed improvement. For example, are multiple cores being used? Can GPU-based processing be used?

That said, since every image-resizing operation is independent of the others, there is a generic solution: you can easily distribute the work across as many servers as you wish. Build one (or more, if needed) extra image-processing server, then arrange for resizing requests to be distributed among them. This can be done at several levels. Is resizing requested by web clients via URL? Randomize the server name in the URLs you send them. Is resizing called from the business layer? Randomize the server name there.
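
A sketch of the URL-level variant, with hypothetical hostnames; since every resize is independent, even naive random assignment spreads load roughly evenly:

    using System;

    public static class ImageServers
    {
        // Hypothetical hosts; each runs the same resizing service.
        private static readonly string[] Hosts =
            { "img1.example.com", "img2.example.com" };
        private static readonly Random Rng = new Random();

        public static string ThumbUrl(string name, int width)
        {
            string host = Hosts[Rng.Next(Hosts.Length)];
            return "http://" + host + "/thumb?name=" + name + "&width=" + width;
        }
    }

If those servers cache what they produce, hashing the image name to a host (for example, the name's hash code modulo the host count) instead of randomizing per request keeps each image's cached output on a single machine.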

If you need a more specific answer, please provide more details.
