Ruby File#gets not reading content after the last blank line (\n)

I'm trying to write a very simple Ruby script that opens a text file and removes the \n from the end of each line UNLESS the line starts with a non-alphabetic character OR the line itself is blank (\n).

The code below works fine, except that it skips all of the content after the last blank (\n) line. When I add \n\n to the end of the file, it works perfectly. For example, a file with this text in it works great and everything is pulled onto one line:

Hello
there my
friend how are you?

becomes Hello there my friend how are you?

But text like this:

Hello

there

my friend
how
are you today

returns just Hello and there, and completely skips the last 3 lines. If I add 2 blank lines to the end, it picks up everything and behaves as I want it to.

Can anybody explain to me why this happens? Obviously I know I can fix this instance by appending \n\n to the end of the source file at the start, but that doesn't help me understand why .gets isn't working as I'd expect.

Thanks in advance for any help!

source_file_name = "somefile.txt"
destination_file_name = "some_other_file.txt"
source_file = File.new(source_file_name, "r")

para = []
x = ""
while (line = source_file.gets)
  if line != "\n"
    if line[0].match(/[A-Za-z]/)   # If the first character is a letter ([A-z] would also match [ \ ] ^ _ `)
      x += line.chomp + " "
    else
      x += "\n" + line.chomp + " "
    end
  else
    para[para.length] = x
    x = ""
  end
end

source_file.close

fixed_file = File.open(destination_file_name, "w")
para.each do |paragraph|
  fixed_file << "#{paragraph}\n\n"
end
fixed_file.close

Your problem is that you only add your string x to the para array when you encounter an empty line ("\n"). Since your second example does not end with an empty line, the final contents of x are never added to the para array.

The easy way to fix this without changing any of your existing code is to add the following lines after the end of your while loop:

if x != ""
  para.push(x)
end

I would prefer to add the strings to the array right away rather than appending them onto x until you hit an empty line, but this should work with your solution.
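To make the effect of that final check concrete, here is a minimal sketch of the same loop with the flush added. The method name collect_paragraphs and the use of StringIO in place of the real file are assumptions made only so the example runs on its own; the logic otherwise follows the question's code.

```ruby
require "stringio"

# Hypothetical helper: the question's loop, reading from any IO-like object.
def collect_paragraphs(io)
  paragraphs = []
  current = ""
  while (line = io.gets)
    if line == "\n"
      # A blank line closes the current paragraph.
      paragraphs << current
      current = ""
    elsif line[0].match?(/[A-Za-z]/)
      # Line starts with a letter: join it onto the current paragraph.
      current += line.chomp + " "
    else
      # Line starts with a non-letter: keep it on its own line.
      current += "\n" + line.chomp + " "
    end
  end
  # Flush whatever was collected after the last blank line.
  paragraphs << current unless current.empty?
  paragraphs
end

text = "Hello\n\nthere\n\nmy friend\nhow\nare you today\n"
p collect_paragraphs(StringIO.new(text))
```

Without the flush line, the last three lines of the sample text stay in current and are never stored, which matches the behaviour described in the question.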

Also,

para.push(x)
para << x

both read much nicer and look more straightforward than

para[para.length] = x

That one threw me off for a second, since in non-dynamic languages, that would give you an error. I advise using one of those instead, simply because it's more readable.
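A small sketch of why all three forms work in Ruby: assigning to an index at or beyond the end of an Array grows it instead of raising. The variable names are illustrative only.

```ruby
a = []
a[a.length] = "one"   # index == length: appends, same effect as push
a.push("two")
a << "three"
p a

b = []
b[2] = "x"            # index past the end: the gap is filled with nil
p b
```

The second array shows the downside of index assignment: a wrong index silently pads with nil, whereas push and << can only append.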

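Your code reads like C code to me. The Ruby way would be something like the line below, which can replace your 100 lines above. As written it only copies the file; the line joining would still have to be applied to the string that File.read returns.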

File.write "dest.txt", File.read("src.txt")

It's easier to use a multiline regex. Maybe:

source_file.read.gsub(/(?<!\n)\n([a-z])/im, ' \\1')
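As a rough check of that regex against the sample text from the question, the sketch below wraps it in a method. The name join_wrapped_lines is hypothetical, and the text is passed in as a string rather than read from source_file.

```ruby
# Hypothetical helper wrapping the gsub from the answer above.
def join_wrapped_lines(text)
  # Replace a newline with a space when it is NOT preceded by another
  # newline and IS followed by a letter; \1 puts that letter back.
  text.gsub(/(?<!\n)\n([a-z])/im, ' \\1')
end

puts join_wrapped_lines("Hello\n\nthere\n\nmy friend\nhow\nare you today\n")
```

Because the whole file is processed as one string, there is no per-paragraph buffer to flush at the end, so the trailing lines are not lost. Blank lines between paragraphs are left untouched, and lines that start with a non-letter keep their line break.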
