
Is it possible to read a PDF file inside a rubyzip archive with pdf-reader?

Is it possible to read a PDF file inside a zip file with pdf-reader? I tried this code, but it does not work.

require 'zip'

Zip::File.open('/path/to/zipfile') do |zip_file|
  zip_file.each do |entry|
    if entry.directory?
      puts "#{entry.name} is a folder!"
    elsif entry.symlink?
      puts "#{entry.name} is a symlink!"
    elsif entry.file?
      puts "#{entry.name} is a regular file!"

      reader = PDF::Reader.new("#{entry.name}")
      page = reader.pages.each do |page|
        puts page.text
      end
    else
      puts "#{entry.name} is something unknown"
    end
  end
end

Thanks

PDF::Reader validates that the input is an "IO-like object or a filename" based on two criteria:

  • It treats the input as "IO-like" if the object responds to seek and read.
  • Otherwise, it treats the input as a filename if File.file? returns true for it.

Excerpt from the source:

def extract_io_from(input)
  if input.respond_to?(:seek) && input.respond_to?(:read)
    input
  elsif File.file?(input.to_s)
    StringIO.new read_as_binary(input)
  else
    raise ArgumentError, "input must be an IO-like object or a filename"
  end
end

Passing entry.name fails the second check, because it names a path inside the archive rather than a file on disk, so File.file? generally returns false. Passing the entry's stream fails the first check: while Zip::InputStream emulates an IO object fairly well, it does not define seek, and therefore it does not pass the validation above. What you can do is create a new StringIO from the contents of the Zip::InputStream via:

StringIO.new(entry.get_input_stream.read)

This ensures that PDF::Reader sees the input as an "IO-like object" and processes it accordingly.
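
Putting this together with the loop from the question, a minimal sketch might look like the following. The helper names seekable_io_for and print_pdf_text are hypothetical (they are not part of rubyzip or pdf-reader), and filtering entries by a ".pdf" extension is an assumption added for illustration. The gems are required inside the method only so that the small helper can be tried on its own.

```ruby
require 'stringio'

# Hypothetical helper: copies an entry's decompressed bytes into a StringIO,
# which responds to both read and seek, so PDF::Reader accepts it.
def seekable_io_for(entry)
  StringIO.new(entry.get_input_stream.read)
end

# Hypothetical helper: prints the text of every PDF found in the archive.
def print_pdf_text(zip_path)
  require 'zip'         # rubyzip gem
  require 'pdf-reader'  # pdf-reader gem

  Zip::File.open(zip_path) do |zip_file|
    zip_file.each do |entry|
      # Assumption: only regular files whose name ends in .pdf are of interest.
      next unless entry.file? && File.extname(entry.name).casecmp?('.pdf')

      reader = PDF::Reader.new(seekable_io_for(entry))
      reader.pages.each { |page| puts page.text }
    end
  end
end

# Usage (path taken from the question):
# print_pdf_text('/path/to/zipfile')
```

Note that this reads each entry fully into memory before parsing, which is fine for typical PDFs but may be costly for very large ones.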
