How do I download a remote image from another site to a file_column in Ruby on Rails?

First question, hopefully I don't mess it up :)

I'm a bit of a Ruby on Rails newbie (and a Ruby newbie too) and have stumbled upon a problem with the intended behavior of my application.

I have a file_column :image in the picture model, which belongs to the product model; a product can have many pictures.

The file_column works just fine when used the way I think it's meant to be used, which is uploading an image with <%= file_column_field "picture", "image" %> and so on.

The problem comes with a text field where the user can enter a CSS selector for an image tag on their own site (they've registered the site and the path to the page where the image should be). I haven't been able to figure out how to properly download the image from that other site "under the hood".

Both of the following methods result in Do not know how to handle a string with value 'GIF89ad..... followed by loads of binary data.

Method 1:

url = URI.parse(picture_www.external_url)
Net::HTTP.start(url.host, url.port) {|http|
  resp = http.get(url.path)
  picture_www.image = resp.body unless resp.nil?
}

Method 2:

res = open(picture_www.external_url)
picture_www.image = res.read unless res.nil?

The external_url contains the correct URL and the download goes OK, so the problem seems to be in the way I'm trying to assign the image to the file_column field. Of course, the problem could also be the way I'm downloading the image; TBH I have no idea where the problem actually lies... :)

Can anyone help me, please?

Update:

Trying to use a tempfile causes "undefined method 'original_filename' for" etc.

  Net::HTTP.start(url.host, url.port) {|http|
    resp = http.get(url.path)
    tempfile = Tempfile.new('test.jpg')
    File.open(tempfile.path, 'wb') do |f|
      f.write resp.body
    end
    picture_www.image = tempfile unless resp.nil?
  }

Update2:

Debugging shows me that an uploaded file has the attributes @content_type ("image/jpeg", for instance) and @original_path (the file name without its path) under @_dc_obj and @tmpfile, while the tempfile I created does not. Would setting these properly make this work? How do I set them? And if I do set those values properly, would that be a "proper" way of downloading the file? I would of course restructure the code once I have a working solution.
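One way to give a tempfile the missing attributes is to define them as singleton methods on the object itself. The following is a minimal sketch, not file_column's own API: the helper name upload_like_tempfile and the sample file name are made up for illustration, and the thread only confirms that original_filename is needed, so content_type is included as an assumption.

```ruby
require 'tempfile'

# Hypothetical helper: write raw bytes to a Tempfile and make it answer
# the two methods an uploaded file provides.
def upload_like_tempfile(data, filename, content_type)
  tempfile = Tempfile.new('remote_image')
  tempfile.binmode          # image data is binary
  tempfile.write(data)
  tempfile.rewind           # let the consumer read from the start

  # Singleton methods exist only on this one object.
  tempfile.define_singleton_method(:original_filename) { filename }
  tempfile.define_singleton_method(:content_type) { content_type }
  tempfile
end

file = upload_like_tempfile("GIF89a".b, 'logo.gif', 'image/gif')
```

With a real download, data would be resp.body, and the result could then be assigned the same way as in the tempfile attempt above (picture_www.image = file).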

Update3:

Minver's answer gave me the solution to the "original_filename" issue, and this code seems to work:

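  # Needs require 'open-uri'; on Ruby 2.5 and later use URI.open instead of open
  io = open(picture_www.external_url)
  # file_column expects an upload-like object, so give the IO an original_filename
  def io.original_filename; base_uri.path.split('/').last; end
  # Only assign the download when the URL actually yields a file name
  picture_www.image = io unless io.original_filename.blank?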

I have no idea whether this is the "proper" way to do it, but it is what I'll be using for now unless some "clearly the right way to do it" solution appears :)

-Pkauko

The UrlUpload method by Joe Martinez is a good solution, but the code is missing a key method. If you override method_missing, you should always override respond_to? as well. In this case it is especially important, since some software uses respond_to? when deciding whether to do a multipart post.

For example, the Faraday gem does this:

def has_multipart?(body)
  body.values.each do |v|
    if v.respond_to?(:content_type)
      return true
    elsif v.respond_to?(:values)
      return true if has_multipart?(v)
    end
  end
  false
end

So, if you are going to use the UrlUpload code, I suggest you add the following method:

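  # Accept the optional include_private argument, as Object#respond_to? does
  def respond_to?(symbol, include_private = false)
    attachment_data.respond_to?(symbol, include_private) || super
  end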

Then Faraday and other related gems will be able to use an instance of this class to generate a proper multipart post.
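On Ruby 1.9.2 and later, the same pairing can also be written with respond_to_missing?, the hook that respond_to? consults for methods handled by method_missing. The sketch below uses a hypothetical DelegatingUpload class wrapping a StringIO instead of the real UrlUpload, so that it runs without network access.

```ruby
require 'stringio'

# Hypothetical stand-in for UrlUpload: wraps any IO-like object.
class DelegatingUpload
  attr_reader :attachment_data

  def initialize(attachment_data)
    @attachment_data = attachment_data
  end

  # Forward unknown calls to the wrapped object.
  def method_missing(symbol, *args, &block)
    if attachment_data.respond_to?(symbol)
      attachment_data.send(symbol, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? in step with what method_missing forwards.
  def respond_to_missing?(symbol, include_private = false)
    attachment_data.respond_to?(symbol, include_private) || super
  end
end

upload = DelegatingUpload.new(StringIO.new("GIF89a"))
upload.respond_to?(:read)          # true, forwarded to the StringIO
upload.respond_to?(:content_type)  # false, a plain StringIO has no content_type
```

A check like Faraday's v.respond_to?(:content_type) then reflects whatever the wrapped object supports, which for an open-uri download includes content_type.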

Here ya go:

require 'open-uri'

class UrlUpload
  EXTENSIONS = {
    "image/jpeg" => ["jpg", "jpeg", "jpe"],
    "image/gif" => ["gif"],
    "image/png" => ["png"]
  }
  attr_reader :original_filename, :attachment_data
  def initialize(url)
    @attachment_data = open(url)
    @original_filename = determine_filename
  end

  # Pass things like size, content_type, path on to the downloaded file
  def method_missing(symbol, *args)
    if self.attachment_data.respond_to? symbol
      self.attachment_data.send symbol, *args
    else
      super
    end
  end

  private
    def determine_filename
      # Grab the path - even though it could be a script and not an actual file
      path = self.attachment_data.base_uri.path
      # Get the filename from the path, make it lowercase to handle those
      # crazy Win32 servers with all-caps extensions
      filename = File.basename(path).downcase
      # If the file extension doesn't match the content type, add it to the end, changing any existing .'s to _
      # Content types missing from EXTENSIONS leave the filename untouched instead of raising on nil
      extensions = EXTENSIONS[self.content_type] || []
      filename = [filename.gsub(/\./, "_"), extensions.first].join(".") unless extensions.empty? || extensions.any? {|ext| filename.ends_with?("." + ext) }
      # Return the result
      filename
    end
end

# Make it always write to tempfiles, never StringIO
OpenURI::Buffer.module_eval {
  remove_const :StringMax
  const_set :StringMax, 0
}

I don't know, but maybe this is what you are looking for. When you save the image you provide a css_selector and get an image file in return.

This is the view:

<%= form_for(@image) do |f| %>

  <div class="field">
    <%= f.label :css_selector %><br />
    <%= f.text_field :css_selector %>
  </div>

  <div class="actions">
    <%= f.submit %>
  </div>

<% end %>

And this is the model:

class Picture < ActiveRecord::Base

  require 'open-uri' # Required to download the photo
  require 'mechanize' # Good gem to parse html pages

  belongs_to :product

  # Define the css_selector (not required as a field in the database)
  attr_accessor :css_selector

  # Before we save the image, we download the photo if image has a css_selector value    
  before_save :download_remote_photo, :if => :css_selector_provided?

  private

    # Check if the attribute is provided      
    def css_selector_provided?
      !self.css_selector.blank?
    end

    # This method opens the page where the photo is
    # and grabs the URL of the image using a CSS selector
    def fetch_photo_url
      agent = Mechanize::new
      page = agent.get(HERE_IS_THE_URL_TO_THE_PAGE_YOU_WANNA_SCRAPE)
      doc = Nokogiri::HTML(page.body)

      image_element = doc.at_css(self.css_selector) # Get the image on that page using the css selector
      image_url = image_element[:src]
    end

    def download_remote_photo
      self.image = do_download_remote_photo(fetch_photo_url)
    end

    def do_download_remote_photo(photo_url)
      io = open(URI.parse(URI.escape(photo_url)))
      def io.original_filename; base_uri.path.split('/').last; end
      io.original_filename.blank? ? nil : io
    rescue # catch url errors with validations instead of exceptions (Errno::ENOENT, OpenURI::HTTPError, etc...)
      nil
    end

end

I haven't tested the code, but I hope you get the idea!
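One detail the model above does not handle: the src attribute found via the CSS selector may be a relative path, which open cannot fetch on its own. A minimal sketch of resolving it against the page address with URI.join follows; the helper name absolute_image_url and the example.com addresses are assumptions for illustration only.

```ruby
require 'uri'

# Hypothetical helper: turn the src of an <img> tag into an absolute URL,
# using the address of the page it was found on.
def absolute_image_url(page_url, src)
  return nil if src.nil? || src.strip.empty?
  URI.join(page_url, src.strip).to_s
end

page = 'http://example.com/products/show.html'
relative = absolute_image_url(page, 'images/logo.gif')
rooted   = absolute_image_url(page, '/img/logo.gif')
absolute = absolute_image_url(page, 'http://cdn.example.org/a.png')
```

In fetch_photo_url this would mean returning absolute_image_url(page_address, image_element[:src]), where page_address stands for whatever URL was passed to agent.get.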
