
Scrape a page on my own site for images to display thumbnails (Rails)

I have a Rails app with posts and post comments. At the top of the post page, I want to display automatically generated thumbnails of all of the images contained within the post and its comments. As users add comments with images, the thumbnails at the top should update to reflect the new images. Two options come to mind, but neither seems perfect:

1) Scrape the page using ScrAPI or similar

2) Create methods in post and post_comment models that scan content for images, which would require some kind of image regex and database queries

It seems like there should be a better way, perhaps with some JavaScript magic. Any ideas?

URL regexes are a solved problem, so I'd go with option 2 and check the posted content for URLs. You can then go one step further and send a HEAD request for each URL to check its Content-Type.
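A rough sketch of that idea in plain Ruby, using only the standard library. The method names (`candidate_urls`, `image_content_type?`, `remote_image?`), the `URL_PATTERN` regex, and the `IMAGE_TYPES` list are made up for illustration, and the regex is deliberately simple rather than a complete URL grammar:

```ruby
require "uri"
require "net/http"

# Assumed whitelist of content types to treat as images; adjust as needed.
IMAGE_TYPES = %w[image/jpeg image/png image/gif image/webp].freeze

# Deliberately simple pattern (an assumption, not a full URL grammar):
# "http://" or "https://" followed by anything up to whitespace or a quote.
URL_PATTERN = %r{https?://[^\s<>"']+}i

# Return the unique http(s) URLs found in a block of user-submitted text.
def candidate_urls(text)
  text.to_s.scan(URL_PATTERN).uniq
end

# True when a Content-Type header value (e.g. "image/png; charset=binary")
# names one of the whitelisted image types.
def image_content_type?(content_type)
  return false if content_type.nil?
  IMAGE_TYPES.include?(content_type.split(";").first.to_s.strip.downcase)
end

# Send a HEAD request and inspect the Content-Type header.
# Any network or parsing failure is treated as "not an image".
def remote_image?(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: 5, read_timeout: 5) do |http|
    image_content_type?(http.head(uri.request_uri)["content-type"])
  end
rescue StandardError
  false
end
```

Something like `candidate_urls(comment_text).select { |url| remote_image?(url) }` would then give the image URLs for one comment, where `comment_text` stands for whatever attribute holds the comment body.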

If the content type is a known image type, download the image and store it somewhere (database, S3, etc.) for rendering later.

I'd put these on a background queue like Delayed Job, though, since making those external requests inline would slow down the user's request.
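As a minimal sketch of the background-queue part: Delayed Job can run any object that responds to `perform`, so the scan could live in a small job class. `ThumbnailScanJob` and its `image?` helper are hypothetical names, and the extension check in `image?` is only a stand-in for the HEAD request so the example runs without network access:

```ruby
require "uri"

# Hypothetical job class. Delayed Job only requires that it respond to #perform.
class ThumbnailScanJob
  # Deliberately simple URL pattern (an assumption, not a full URL grammar).
  URL_PATTERN = %r{https?://[^\s<>"']+}i
  IMAGE_EXTENSIONS = %w[.png .jpg .jpeg .gif].freeze

  def initialize(content)
    @content = content
  end

  # Runs in the worker process, outside the user's request.
  # Returns the image URLs found; a real app would download and store
  # them (or save the URLs) here instead of just returning them.
  def perform
    @content.to_s.scan(URL_PATTERN).uniq.select { |url| image?(url) }
  end

  private

  # Stand-in check based on the file extension; swap in a HEAD request
  # and Content-Type check for real use.
  def image?(url)
    IMAGE_EXTENSIONS.include?(File.extname(URI.parse(url).path.to_s).downcase)
  rescue URI::InvalidURIError
    false
  end
end

# In the Rails app, with the delayed_job gem installed, the job could be
# queued after a comment is saved (comment.body is a hypothetical attribute):
#   Delayed::Job.enqueue(ThumbnailScanJob.new(comment.body))
```

Passing the raw text rather than a model instance keeps the job small and easy to serialize, though passing a record ID and loading the record inside `perform` would work just as well.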

You want the images updated after every post, so option 1 sounds better. It would also still work in browsers that don't support JavaScript.
