
Scrape a page on my own site for images to display thumbnails (Rails)

I have a Rails app with posts and post comments. At the top of the post page, I want to display automatically generated thumbnails of all the images contained in the post and its comments. As users add comments with images, the thumbnails at the top should update to reflect the new images. Two options come to mind, but neither seems perfect:

1) Scrape the page using ScrAPI or similar

2) Create methods in the post and post_comment models that scan content for images, which would require some kind of image regex and database queries

It seems like there should be a better way, with some JavaScript magic or something. Any ideas?

URL regexes are a solved problem, so I'd go with option 2 and check the posted content for URLs. Then you can go one step further and send a HEAD request for each URL to check its Content-Type.
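A minimal sketch of that check, using only Ruby's standard library. The pattern `URL_PATTERN`, the list `KNOWN_IMAGE_TYPES`, and the method names are assumptions made for illustration; the regex is deliberately simple and will miss edge cases.

```ruby
require 'net/http'
require 'uri'

# Deliberately simple pattern (an assumption): http(s) URLs up to the next
# whitespace, quote, or angle bracket.
URL_PATTERN = %r{https?://[^\s<>"']+}

# Content types treated as "known images" here (an assumption; extend as needed).
KNOWN_IMAGE_TYPES = %w[image/jpeg image/png image/gif image/webp].freeze

# Returns every distinct URL found in a block of post or comment text.
def extract_urls(text)
  text.to_s.scan(URL_PATTERN).uniq
end

# True when a Content-Type header value names one of the known image types.
def image_type?(content_type)
  KNOWN_IMAGE_TYPES.include?(content_type.to_s.split(';').first.to_s.strip.downcase)
end

# Sends a HEAD request (no body is downloaded) and inspects the Content-Type.
# Network errors are treated as "not an image".
def remote_image?(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                             open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) && image_type?(response['content-type'])
rescue StandardError
  false
end
```

Something like `extract_urls(comment.body).select { |url| remote_image?(url) }` would then give the image URLs for one comment, where `comment.body` is a hypothetical attribute name.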

If the content type is a known image type, download the image and store it somewhere (database, S3, etc.) for rendering later.
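As a rough sketch of the download-and-store step, the code below saves images to a local directory instead of a database or S3. The `EXTENSIONS` table, the hashed file naming, and all method names are assumptions for illustration. `Net::HTTP.get_response` does not follow redirects, so a real implementation would need to handle those.

```ruby
require 'digest'
require 'fileutils'
require 'net/http'
require 'uri'

# File extensions for the image types we accept (an assumption).
EXTENSIONS = {
  'image/jpeg' => '.jpg',
  'image/png'  => '.png',
  'image/gif'  => '.gif'
}.freeze

# Builds a stable, filesystem-safe name from the URL and its content type.
# Returns nil for content types we do not recognise.
def storage_name(url, content_type)
  ext = EXTENSIONS[content_type.to_s.split(';').first.to_s.strip.downcase]
  ext && Digest::SHA256.hexdigest(url)[0, 16] + ext
end

# Writes the raw image bytes under dir and returns the full path.
def store_image(bytes, name, dir)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, name)
  File.binwrite(path, bytes)
  path
end

# Downloads url and stores it if the response is a recognised image.
# Returns the stored path, or nil when nothing was saved.
def download_and_store(url, dir)
  response = Net::HTTP.get_response(URI.parse(url))
  return nil unless response.is_a?(Net::HTTPSuccess)

  name = storage_name(url, response['content-type'])
  name && store_image(response.body, name, dir)
end
```

Naming files after a hash of the URL means the same image linked from several comments is stored only once.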

I'd put this work on a background queue such as Delayed Job, though, as those external requests would otherwise hold up the response and hurt the user's experience.
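Delayed Job can run any object that responds to `perform`, so the scan can be wrapped in a small job object. The sketch below is an assumption-heavy illustration: `ThumbnailStore` is a hypothetical in-memory stand-in for a database table, `ScanImagesJob` is a made-up name, and the job runs inline so the example works without the gem.

```ruby
# In-memory stand-in for wherever the image URLs end up (hypothetical;
# a real app would write to a database table instead).
module ThumbnailStore
  @urls = Hash.new { |hash, key| hash[key] = [] }

  # Adds urls to the record's list, skipping ones already present.
  def self.add(record_id, urls)
    @urls[record_id] |= urls
  end

  def self.urls_for(record_id)
    @urls[record_id]
  end
end

# A Struct carrying the record id and its text is enough for a sketch,
# because the queue only needs an object with a #perform method.
ScanImagesJob = Struct.new(:record_id, :content) do
  def perform
    # Simplified pattern (an assumption): URLs ending in a common image extension.
    urls = content.to_s.scan(%r{https?://[^\s<>"']+\.(?:png|jpe?g|gif)}i)
    ThumbnailStore.add(record_id, urls)
  end
end

# In the Rails app, with the delayed_job gem installed, the job would be queued with:
#   Delayed::Job.enqueue(ScanImagesJob.new(post.id, post.body))
# Here it is run inline to show what the worker would do.
ScanImagesJob.new(1, 'Nice shot: http://example.com/cat.png').perform
```

The HEAD check and the download would also belong inside `perform`, so that no external request happens while the user waits for the page.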

You want to have the updated images after every post, so option 1 sounds better. And this would still work for browsers that don't support JavaScript.
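For option 1, a minimal sketch of pulling image sources out of the rendered page might look like the following. It uses a simplified regex rather than ScrAPI; `IMG_SRC`, `image_sources`, and `scrape_images` are illustrative names, and a proper HTML parser would be more robust against unusual markup.

```ruby
require 'net/http'
require 'uri'

# Simplified matcher for <img ... src="..."> (an assumption; a real HTML
# parser copes better with unusual markup).
IMG_SRC = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/i

# Returns the distinct src values of all <img> tags in an HTML string.
def image_sources(html)
  html.to_s.scan(IMG_SRC).flatten.uniq
end

# Fetches a page from the app itself and lists its images.
def scrape_images(page_url)
  image_sources(Net::HTTP.get(URI.parse(page_url)))
end
```

Because the page is rendered by the same app, the scan could also run on the rendered HTML string directly, without an HTTP round trip.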
