
Reading a text file in Ruby gives wrong output

I am not an experienced Ruby programmer, so bear with me. I have a problem with one specific text file containing two lines (the issue shows up only occasionally):

trim(0, 15447)
0, 15447

I am trying to read these two lines with the following code:

File.open(trim).each do |line|
   puts line
end

Normally I get the expected output, but here I get only one line, with some characters missing:

0, 1544715447)

If I want to check the character codes, I get this:

irb(main):120:0> File.open(trim).each do |line|
irb(main):121:1* puts '========================'
irb(main):122:1> puts line
irb(main):123:1> puts '........................'
irb(main):124:1> puts line.each_byte {|c| print c, ' ' }
irb(main):125:1> end
========================
0, 1544715447)
........................
116 114 105 109 40 48 44 32 49 53 52 52 55 41 13 48 44 32 49 53 52 52 55 trim(0,0, 15447
=> #<File:E:\Public\Public_videos\Soccer\1995_0129_odp_es\950129-ODP_&m3_trim30.txt>

Frankly, I don't understand what is going on: I don't see any hidden character, and this happens seemingly at random, but consistently with certain files. Any suggestion to help me understand or avoid this issue would be greatly appreciated.

What happened is that your file has two "lines" separated by a carriage return character rather than a line feed.

You showed the bytes in your file as

116 114 105 109 40 48 44 32 49 53 52 52 55 41 13 48 44 32 49 53 52 52 55

That 13 is a carriage return (CR), which a terminal typically "displays" by moving the cursor back to the start of the line it is currently writing.

So first it wrote out

trim(0, 15447)

then it went back to the start of the same line and wrote

0, 15447

overlaying the initial line! What do you end up with?

0, 1544715447)
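
To see this for yourself, here is a small sketch that rebuilds the string from the bytes you posted. String#inspect escapes control characters, so the carriage return becomes visible as \r. The render_with_cr helper is hypothetical: it is only a rough imitation of how a terminal handles CR, written for illustration.

```ruby
# Bytes copied from the question: "trim(0, 15447)" + CR (13) + "0, 15447"
bytes = [116, 114, 105, 109, 40, 48, 44, 32, 49, 53, 52, 52, 55, 41, 13,
         48, 44, 32, 49, 53, 52, 52, 55]
line = bytes.pack("C*").force_encoding("UTF-8")

# inspect escapes control characters, so the hidden CR shows up as \r
puts line.inspect

# Hypothetical helper: imitate a terminal that returns to column 0 on CR.
# Each segment overwrites the start of what is already on screen;
# any leftover tail of the previous text stays visible.
def render_with_cr(text)
  text.split("\r").reduce("") do |screen, segment|
    segment + (screen[segment.length..-1] || "")
  end
end

puts render_with_cr(line)
```

In your own loop, replacing puts line with p line (or puts line.inspect) should expose the \r in the same way.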

Your "problem" is probably best fixed by converting that text file to use a more common line separator. On Unix systems, including macOS, the line terminator is character 10, known as LINE FEED (LF). Windows uses the two-character combination 13 10 (CR LF). To my knowledge, only classic Mac OS (before OS X) used a bare 13.

Many text editors today will allow you to select a "line ending" option, so you might be able to just open that file, then save it using a different line ending option. FWIW my guess is that you are using Windows now, which is known for rendering CRs and LFs differently than *Nix systems.
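
If you would rather handle this in Ruby than in an editor, one possible approach is to read the raw bytes and split on any of the three line endings yourself. This is only a sketch: read_lines_any_ending and normalize_to_lf! are hypothetical helper names, and the temp file merely stands in for your real file.

```ruby
require "tempfile"

# Hypothetical helper: return the lines of a file, whatever its line endings.
# binread skips text-mode newline translation, so the bytes arrive untouched.
def read_lines_any_ending(path)
  File.binread(path).split(/\r\n|\r|\n/)
end

# Hypothetical helper: rewrite the file in place with LF-only line endings.
def normalize_to_lf!(path)
  File.binwrite(path, File.binread(path).gsub(/\r\n?/, "\n"))
end

# Demo on a temp file that has the same content as the file in the question
Tempfile.create("cr_demo") do |f|
  f.binmode
  f.write("trim(0, 15447)\r0, 15447")
  f.flush

  read_lines_any_ending(f.path).each { |l| puts l }

  normalize_to_lf!(f.path)
  p File.binread(f.path)
end
```

Splitting on /\r\n|\r|\n/ tries CR LF first, so a Windows line ending is not counted as two separate breaks.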
