Why does Python 2 show \r (raw escaped) and Python 3 does not?

Question

I had been fighting a path error (No file or directory found) for hours. After a lot of debugging, I realised that Python 2 added an invisible '\r' at the end of each line.

The input (trainval.txt):

Images/K0KKI1.jpg Labels/K0KKI1.xml
Images/2KVW51.jpg Labels/2KVW51.xml
Images/MMCPZY.jpg Labels/MMCPZY.xml
Images/LCW6RB.jpg Labels/LCW6RB.xml

The code I used to debug the error:

with open('trainval.txt', "r") as lf:
    for line in lf.readlines():
        print((line), repr(line))
        img_file, anno = line.strip("\n").split(" ")
        print(repr(img_file), repr(anno))

Python2 output:

("'Images/K0KKI1.jpg'", "'Labels/K0KKI1.xml\\r'")
('Images/2KVW51.jpg Labels/2KVW51.xml\r\n', "'Images/2KVW51.jpg Labels/2KVW51.xml\\r\\n'")
("'Images/2KVW51.jpg'", "'Labels/2KVW51.xml\\r'")
('Images/MMCPZY.jpg Labels/MMCPZY.xml\r\n', "'Images/MMCPZY.jpg Labels/MMCPZY.xml\\r\\n'")
("'Images/MMCPZY.jpg'", "'Labels/MMCPZY.xml\\r'")
('Images/LCW6RB.jpg Labels/LCW6RB.xml\r\n', "'Images/LCW6RB.jpg Labels/LCW6RB.xml\\r\\n'")
("'Images/LCW6RB.jpg'", "'Labels/LCW6RB.xml\\r'")

Python3 output:

Images/K0KKI1.jpg Labels/K0KKI1.xml
 'Images/K0KKI1.jpg Labels/K0KKI1.xml\n'
'Images/K0KKI1.jpg' 'Labels/K0KKI1.xml'
Images/2KVW51.jpg Labels/2KVW51.xml
 'Images/2KVW51.jpg Labels/2KVW51.xml\n'
'Images/2KVW51.jpg' 'Labels/2KVW51.xml'
Images/MMCPZY.jpg Labels/MMCPZY.xml
 'Images/MMCPZY.jpg Labels/MMCPZY.xml\n'
'Images/MMCPZY.jpg' 'Labels/MMCPZY.xml'
Images/LCW6RB.jpg Labels/LCW6RB.xml
 'Images/LCW6RB.jpg Labels/LCW6RB.xml\n'
'Images/LCW6RB.jpg' 'Labels/LCW6RB.xml'

As annoying as it was, it was that small '\r' that caused the path error. I could not see it in my console until I wrote the script above. My question is: why is this '\r' even there? I did not create it. Something somewhere added it. It would be helpful if someone could tell me what this small '\r' is for, why it appears in Python 2 and not in Python 3, and how to avoid bugs caused by it.

Answer 1

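The difference comes from how the two versions process a Windows text file. Python 3 opens text files with universal newlines by default, so '\r\n' is translated to '\n' when reading. Python 2's open() on a non-Windows platform leaves the '\r' in place unless you open the file in 'rU' mode.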
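To make that difference visible, here is a minimal Python 3 sketch. It assumes the file really has Windows (CRLF) line endings; the temporary file name is made up for the example. In Python 3, open() in text mode translates '\r\n' to '\n' by default, while passing newline="" turns the translation off and shows roughly what Python 2 handed back on a non-Windows machine.

```python
import os
import tempfile

# Write a small file with Windows (CRLF) line endings. Binary mode is used
# so the bytes land on disk exactly as given.
path = os.path.join(tempfile.mkdtemp(), "trainval_crlf.txt")  # hypothetical name
with open(path, "wb") as f:
    f.write(b"Images/K0KKI1.jpg Labels/K0KKI1.xml\r\n")

# Default text mode in Python 3: '\r\n' is translated to '\n' on read.
with open(path, "r") as f:
    translated = f.readline()

# newline="" disables the translation, so the '\r' stays in the string.
with open(path, "r", newline="") as f:
    untranslated = f.readline()

print(repr(translated))
print(repr(untranslated))
```

If the code has to run under Python 2 as well, io.open offers the same newline handling as Python 3's built-in open (it returns unicode strings there).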

The issue here is that your file is in Windows text format, so each line contains one or more carriage return characters ('\r') before the linefeed ('\n'). A quick and generic fix is to replace:

img_file, anno = line.strip("\n").split(" ")

with just:

img_file, anno = line.split()

Without arguments, str.split is very smart:

it splits on any run of whitespace (linefeed, space, carriage return, tab)
it discards empty fields, including those from leading and trailing whitespace (so no need for strip at all)

So use that form, which does not depend on the platform or the Python version, unless you need a really specific split operation, and your problems will be history.
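A small sketch of the two variants side by side, using one line as Python 2 returned it from the Windows-format file. The names img_file2 and anno2 are only there for the comparison.

```python
# One line as read without newline translation: note the trailing '\r\n'.
line = "Images/K0KKI1.jpg Labels/K0KKI1.xml\r\n"

# Original approach: only '\n' is stripped, so '\r' stays glued to the path.
img_file, anno = line.strip("\n").split(" ")
print(repr(img_file), repr(anno))

# Suggested approach: split() with no arguments splits on any whitespace
# and drops the empty trailing field, so '\r' and '\n' both disappear.
img_file2, anno2 = line.split()
print(repr(img_file2), repr(anno2))
```

One caveat: split() with no arguments also breaks on spaces inside a path, so it only fits files like this one where the paths contain no whitespace.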

As an aside, don't write for line in lf.readlines(): but just for line in lf: instead. It reads and yields the lines one by one, which is handy when the file is big because the whole file is never held in memory at once.
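Putting both suggestions together, a possible reader could look like the sketch below. The helper name read_pairs and the choice to skip lines that do not have exactly two fields are assumptions made for the example, not something from the question. The demo writes its own temporary CRLF file so it can run anywhere.

```python
import os
import tempfile


def read_pairs(path):
    """Yield (image_path, annotation_path) tuples from a list file."""
    with open(path, "r") as lf:
        for line in lf:  # lazy iteration: one line in memory at a time
            fields = line.split()  # any whitespace, including '\r'
            if len(fields) != 2:
                continue  # assumption: silently skip blank or malformed lines
            yield fields[0], fields[1]


# Demo with a temporary file that uses Windows line endings.
path = os.path.join(tempfile.mkdtemp(), "trainval.txt")
with open(path, "wb") as f:
    f.write(b"Images/K0KKI1.jpg Labels/K0KKI1.xml\r\n"
            b"Images/2KVW51.jpg Labels/2KVW51.xml\r\n")

pairs = list(read_pairs(path))
print(pairs)
```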

solution1
2 2018-08-02 08:50:18
