
How do I read a text file line by line in Ruby (hosted on S3)?

I know I've done this before and found a simple piece of code for it, but I cannot remember or find it :(.

I have a text file of records I want to import into my Rails 3 application.

Each line represents a record. The attributes may be tab-delimited, but I am fine with just a single value per line as well.

How do I do this?

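# File.foreach closes the file once the block finishes
File.foreach("my/file/path") do |line|
  # Example line: name: "Angela"    job: "Writer"    ...
  data = line.chomp.split("\t")
  # Keep the part after "key: " in each field
  name, job = data.map { |d| d.split(": ", 2)[1] }
end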
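
The same idea can be wrapped in a small helper. This is only a sketch, assuming every field has the `key: "value"` shape shown in the comment above; the name `parse_record` is hypothetical and not part of the original answer.

```ruby
# Hypothetical helper: turn one tab-delimited line of `key: "value"` fields
# into a Hash. Assumes the layout shown in the answer's comment.
def parse_record(line)
  line.chomp.split("\t").each_with_object({}) do |field, record|
    key, value = field.split(": ", 2)
    next if value.nil? # skip fields that lack the "key: value" shape
    # Strip surrounding whitespace and the double quotes around the value
    record[key.strip] = value.strip.delete_prefix('"').delete_suffix('"')
  end
end

sample = %(name: "Angela"\tjob: "Writer"\n)
record = parse_record(sample)
puts "#{record['name']} is a #{record['job']}"
```

Returning a Hash keyed by attribute name avoids relying on the column order, which may matter if some lines omit a field.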

Related topic: What are all the common ways to read a file in Ruby?

You want IO.foreach:

IO.foreach('foo.txt') do |line|
  # process the line of text here
end

Alternatively, if it really is tab-delimited, you might want to use the CSV library:

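require 'csv'

# CSV.foreach takes the path directly and closes the file for you
CSV.foreach('foo.txt', col_sep: "\t") do |csv_row|
  # csv_row is an Array of the fields on this line, already parsed
end

out = $stdout # assumed output target; the original snippet never defined "out"
IO.foreach("input.txt") do |line|
  out.puts line
  # You might be able to use split or something to get attributes
  atts = line.split
end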
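
For the tab-delimited case, here is a minimal sketch of the CSV approach that runs without a file on disk. The sample data and the `name`/`job` columns are assumptions made for illustration; `StringIO` stands in for the real file.

```ruby
require 'csv'
require 'stringio'

# Hypothetical sample data standing in for a tab-delimited file.
io = StringIO.new("Angela\tWriter\nBob\tEditor\n")

# CSV.new wraps any IO-like object; col_sep switches the delimiter to a tab.
# Each row arrives as an Array, which the block destructures into two fields.
rows = CSV.new(io, col_sep: "\t").map do |name, job|
  { name: name, job: job }
end

rows.each { |row| puts "#{row[:name]} works as #{row[:job]}" }
```

Compared with a plain `split`, the CSV library also copes with quoted fields, which may be worth having if the export tool quotes values that contain tabs.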

Have you tried using OpenURI ( http://ruby-doc.org/stdlib-2.1.2/libdoc/open-uri/rdoc/OpenURI.html )? You would have to make your files accessible from S3.

Or try using the aws-sdk gem ( http://aws.amazon.com/sdk-for-ruby ).

You can use OpenURI to read remote or local files.

Assuming that your model has an attachment named file:

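require 'open-uri'

# If the object is stored in Amazon S3, access it through its URL
file_path = record.file.respond_to?(:s3_object) ? record.file.url : record.file.path
# URI.open (Ruby 2.5+) fetches URLs and falls back to Kernel#open for local paths;
# plain open(url) no longer works on Ruby 3.0 and later
URI.open(file_path) do |file|
  file.each_line do |line|
    # In your case, you can split items using tabs
    line.chomp.split("\t").each do |item|
      # Process item
    end
  end
end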
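
Because the handle yielded by OpenURI behaves like any other IO object, the per-line logic can be kept separate from where the file lives. The sketch below assumes two tab-separated fields per line; `each_record` is a hypothetical helper name, and `StringIO` stands in for the remote file so the example runs without network access.

```ruby
require 'stringio'

# Hypothetical helper: yields the tab-separated fields of each non-blank line
# from any object that responds to each_line (File, StringIO, or the handle
# yielded by open-uri).
def each_record(io)
  return enum_for(:each_record, io) unless block_given?

  io.each_line do |line|
    fields = line.chomp.split("\t")
    next if fields.empty? # ignore blank lines
    yield fields
  end
end

# StringIO stands in for the remote file here.
remote = StringIO.new("Angela\tWriter\n\nBob\tEditor\n")
each_record(remote) { |name, job| puts "#{name}: #{job}" }
```

The same helper can then be called with the block argument from the OpenURI example above, or with a local `File`, without changing the parsing code.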
