
How to convert a PDF file into an XLSX file in Ruby on Rails

I upload a PDF and want to convert it to an XLSX file. I have tried different approaches, but none of them gives the expected output. pdf2xls only produces a single line instead of the whole file's data. I want all of the PDF's data to appear in the XLSX file.

I have one method that converts the PDF to XLSX, but the output is not formatted properly.

def do_excel_to_pdf
    @user=User.create!(pdf: params[:pdf])
    @path_in = @user.pdf.path
    temp1 = @user.pdf.path
    @path_out = @user.pdf.path.slice(0..@user.pdf.path.rindex(/\//))
    query = "libreoffice --headless --invisible --convert-to pdf " + @path_in + " --outdir " + @path_out
    system(query)
    file = @path_out+@user.pdf.original_filename.slice(0..@user.pdf.original_filename.rindex('.')-1)+".pdf"
    send_file file, :type=>"application/msexcel", :x_sendfile=>true
end

If anyone has done this, please help; any gem or script would do.

I was not able to find an option to convert PDF to XLSX, but there are API options for converting PDF to image and PDF to PowerPoint (link below). Not sure whether you can change the requirement to show the results in another format.

http://www.convertapi.com/

I would start with reading from the PDF. Inserting the data into the XLSX is easy; if you have problems with that, ask another question and specify which gem you use and what you tried for that part.

You use LibreOffice to read the PDF, but according to the FAQ your PDF needs to be a hybrid PDF; perhaps that is the problem.
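If you stay with the LibreOffice route, here is a small sketch of how the output path and the command could be built with `File.dirname` / `File.basename` and an argument list instead of string slicing and concatenation. `build_convert_command` and the sample path are made up for illustration, and the target format is just a parameter; this says nothing about whether LibreOffice can actually turn your particular PDF into a usable spreadsheet.

```ruby
# Sketch only: build_convert_command is a hypothetical helper, not part of any gem.
# It returns the argument list for LibreOffice and the path where the
# converted file would be expected to appear.
def build_convert_command(input_path, target_ext)
  out_dir   = File.dirname(input_path)          # directory of the uploaded file
  base_name = File.basename(input_path, ".*")   # file name without its extension
  output    = File.join(out_dir, "#{base_name}.#{target_ext}")

  argv = ["libreoffice", "--headless", "--convert-to", target_ext,
          "--outdir", out_dir, input_path]
  [argv, output]
end

argv, output = build_convert_command("/tmp/uploads/report.pdf", "xlsx")

# system(*argv) would run the command without going through a shell,
# so spaces or quotes in the uploaded file name cannot break the command line.
puts output
```

Passing the arguments separately to `system` matters here because the file name comes from a user upload.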

As an alternative, you could try an ebook conversion tool such as the one in Calibre, but I'm afraid you would lose too much formatting to recover the data you need.

It all depends on how the data in your PDF is structured. If it is regular text without much formatting and positioning, it can be as easy as using the gem pdf-reader.

I used it in the past, and my data had a lot of formatting (you would be surprised how complicated the PDF structure is), so I had to specify for each field exactly which data had to be read at which location. Not for the faint of heart.
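To give an idea of what "specifying a location per field" can look like, here is a minimal sketch that assumes the extracted text keeps its column alignment, so each field sits at a fixed character range in a line. The `FIELDS` layout, the `extract_fields` helper and the sample line are all hypothetical; a real PDF will need its own positions worked out by hand.

```ruby
# Hypothetical layout: each field occupies a fixed character range in a line.
FIELDS = {
  name:   0...12,
  amount: 12...20,
  date:   20...30
}.freeze

# Cut every configured range out of the line and trim the padding.
# Ranges that fall outside a short line yield an empty string.
def extract_fields(line, fields = FIELDS)
  fields.transform_values { |range| line[range].to_s.strip }
end

# Made-up sample line, padded to the widths assumed in FIELDS.
line = format("%-12s%8s%-10s", "Widget A", "12.50", "2020-01-15")

p extract_fields(line)
```

Each resulting hash can then become one row of the spreadsheet.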

Here is a simple example.

require 'pdf/reader' # gem install pdf-reader

reader = PDF::Reader.new("my.pdf")
reader.pages.each do |page|
  # Print the text extracted from each page
  puts page.text
end
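Once you have the text of a page, the next step is usually to break it into rows and cells before handing it to whatever XLSX gem you use. A rough sketch, assuming the columns in the extracted text are separated by two or more spaces; `SAMPLE_TEXT` is a made-up stand-in for what `page.text` might return, and `text_to_rows` is a hypothetical helper.

```ruby
# Made-up stand-in for the text of a simple table-like PDF page.
SAMPLE_TEXT = <<~TEXT
  Item        Qty   Price
  Widget A    2     12.50

  Widget B    10    3.00
TEXT

# Split the text into lines, drop blank lines, and split each remaining
# line into cells wherever two or more whitespace characters occur.
def text_to_rows(text)
  text.each_line
      .map(&:strip)
      .reject(&:empty?)
      .map { |line| line.split(/\s{2,}/) }
end

rows = text_to_rows(SAMPLE_TEXT)
rows.each { |row| p row }
```

Each element of `rows` is an array of strings, which is the shape most spreadsheet-writing gems expect for a row. Text with single spaces between columns, or cells that wrap across lines, will need a different splitting rule.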
