Python 3 Special characters escaping

Question

import urllib
from urllib.request import urlopen


address='http://www.iitb.ac.in/acadpublic/RunningCourses.jsp?deptcd=EE&year=2012&semester=1'
source= urlopen(address).read()
source=str(source)


from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
        def handle_data(self, data):
            x=str(data)
            if x != ('\r\n\t\t\t\t') or ('\r\n\t\t\t\t\t') or ('\r\n\r\n\t\t\t'):
                print("Encountered some data:",x)

parser = MyHTMLParser(strict=False)
parser.feed(source)

The code above isn't working: it still prints strings such as '\\r\\n\\t\\t\\t\\t'. Any suggestions?

Answer 1

if x != ('\r\n\t\t\t\t') or ('\r\n\t\t\t\t\t') or ('\r\n\r\n\t\t\t')

should be

if x not in ('\r\n\t\t\t\t', '\r\n\t\t\t\t\t', '\r\n\r\n\t\t\t')

or better:

if not x.isspace()

Your original condition is evaluated as:

if (x != ('\r\n\t\t\t\t')) or '\r\n\t\t\t\t\t' or '\r\n\r\n\t\t\t'

Notice that the last two values are evaluated on their own, not compared with x. Only an empty string evaluates to False, so each non-empty literal is truthy and the condition always passes.
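To make the difference concrete, here is a small sketch that runs the three forms of the test on one of the whitespace strings from the question. The names `original` and `blanks` are only for illustration and do not appear in the question.

```python
# One of the whitespace-only chunks the question tries to skip.
x = '\r\n\t\t\t\t'

# Hypothetical helper tuple holding the three strings from the question.
blanks = ('\r\n\t\t\t\t', '\r\n\t\t\t\t\t', '\r\n\r\n\t\t\t')

# The original condition: the comparison is False here, so `or` falls
# through and returns the second string literal, which is truthy.
original = (x != '\r\n\t\t\t\t') or '\r\n\t\t\t\t\t' or '\r\n\r\n\t\t\t'
print(bool(original))      # True: the branch is taken even for a blank chunk

# The corrected membership test and the more general whitespace test.
print(x not in blanks)     # False: the chunk is skipped
print(not x.isspace())     # False: the chunk is skipped
```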

Answer 2

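Maybe the number of \\t, \\r, etc. varies from chunk to chunk. Try this, which skips any chunk made up only of whitespace:

# strip() already removes leading and trailing '\r', '\n' and '\t',
# so a whitespace-only chunk becomes an empty (falsy) string.
if x.strip():
    print("Encountered some data:",x)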
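One more thing may be contributing to the doubled backslashes in the output: `urlopen(address).read()` returns bytes, and calling `str()` on bytes produces the text of its repr, in which `\r`, `\n` and `\t` become literal backslash sequences rather than whitespace. The sketch below assumes the page is UTF-8 encoded and uses a made-up HTML fragment in place of the real download. The class name `TextCollector` and the sample cell text are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect non-blank text chunks instead of printing them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip chunks that contain only whitespace (\r, \n, \t, spaces).
        if data.strip():
            self.chunks.append(data.strip())


# Made-up stand-in for the bytes returned by urlopen(address).read().
raw = b"<table>\r\n\t\t\t\t<tr>\r\n\t\t\t\t\t<td>EE 101</td>\r\n\r\n\t\t\t</tr></table>"

# str() on bytes gives the repr text, with literal backslashes in it.
print("\\r\\n" in str(raw))   # True
print("\r\n" in str(raw))     # False: no real line breaks are left

# Decoding keeps the real whitespace characters (UTF-8 is an assumption).
html = raw.decode("utf-8")

parser = TextCollector()
parser.feed(html)
parser.close()
print(parser.chunks)          # ['EE 101']
```

The parser is created without the `strict` argument used in the question, because that parameter was removed from `HTMLParser` in Python 3.5.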

solution1
1 2013-06-13 06:20:21

solution2
0 2013-06-13 06:23:01
