I'm creating an abstract web crawler class, Crawler. This class can only be used through a child class that implements some of its methods.
(I removed the __init__ method because it's irrelevant.)
class Crawler():
    def get_image_source_url(self, image_page_soup):
        return NotImplementedError("method get_image_source_url must be implemented")

    def get_image_thumbnail_url(self, image_page_soup):
        return NotImplementedError("method get_image_thumbnail_url must be implemented")

    def get_tags_container(self, image_page_soup):
        return NotImplementedError("method get_tags_container must be implemented")
They are implemented like this in the child class:
class LittlevisualsCrawler(Crawler):
    def get_image_containers(self, image_page_soup):
        return image_page_soup.find('article', class_='photo')

    def get_image_source_url(self, image_page_soup):
        return image_page_soup.find('img')['src']

    def get_image_thumbnail_url(self, image_page_soup):
        return image_page_soup.find('img')['data-1280u']

    def get_tags_container(self, image_page_soup):
        return image_page_soup.find('ul', class_='tags')
I want to wrap each function in a try-except block no matter how the child class implements it, so the end result would be (pseudo-ish code):
class Crawler():
    def get_image_source_url(self, image_page_soup):
        if not_implemented:
            return NotImplementedError("method get_image_source_url must be implemented")
        else:
            try:
                whatever the child class chooses to do
            except Exception:
                handle exception however I decide in the parent class
I don't know how to check whether a method is implemented, or how to wrap the function without the child class overriding everything. Am I clear about what I'm trying to achieve?
Note: This post contains two different implementation techniques that allow for what you want.
The easiest way around this issue is to refactor the code so that the child classes do not directly override the function used by the public interface.
Instead, provide the public functionality directly in the base class, and make children override a "worker" (an implementation detail) of that function, which is then called by the function invoked "from the outside".
Example Implementation
class Base(object):
    def get_message(self):
        try:
            return self.get_message_impl()
        except Exception as detail:
            print("error:", detail)
            return None

    def get_message_impl(self):
        raise Exception("Not Implemented")


class Foo(Base):
    def get_message_impl(self):
        return "Hello World"


class Bar(Base):
    def get_message_impl(self):
        raise Exception("Bar.get_message_impl always fails!")


f = Foo()
b = Bar()
f_msg = f.get_message()
b_msg = b.get_message()
print("f_msg:", f_msg)
print("b_msg:", b_msg)
Output:
error: Bar.get_message_impl always fails!
f_msg: Hello World
b_msg: None
If you would like to keep the possibility of overriding the public functionality presented in the base class, while still being able to easily call a "protected" version of the functions later, you could create a simple wrapper such as the one below:
class Base(object):
    class Protected(object):
        # Proxy that forwards calls to the target and swallows its exceptions.
        def __init__(self, target):
            self.target = target

        def get_message(self):
            try:
                return self.target.get_message()
            except Exception as detail:
                print("error:", detail)
                return None

    def __init__(self):
        self.protected = self.Protected(self)

    def get_message(self):
        raise Exception("Not Implemented")


class Foo(Base):
    def get_message(self):
        return "Hello World"


class Bar(Base):
    def get_message(self):
        raise Exception("Bar.get_message always fails!")


f = Foo()
b = Bar()
f_msg = f.protected.get_message()  # protected from failure
b_msg = b.protected.get_message()  # protected from failure
b_msg = b.get_message()            # will raise exception
NotImplementedError is an exception; don't return it, raise it as an exception:
class Crawler():
    def get_image_source_url(self, image_page_soup):
        raise NotImplementedError("method get_image_source_url must be implemented")

    def get_image_thumbnail_url(self, image_page_soup):
        raise NotImplementedError("method get_image_thumbnail_url must be implemented")

    def get_tags_container(self, image_page_soup):
        raise NotImplementedError("method get_tags_container must be implemented")
You don't need to 'wrap' anything here. If the subclass implements the method, the original method won't be called and no exception is raised.
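As a minimal sketch of the difference between returning and raising, consider the following. The class names ReturningBase, RaisingBase and Child are hypothetical and exist only for this illustration.

```python
class ReturningBase:
    def get_tags_container(self, soup):
        # Bug: the caller receives an exception *object* as an ordinary value.
        return NotImplementedError("method get_tags_container must be implemented")


class RaisingBase:
    def get_tags_container(self, soup):
        # The call fails loudly if a subclass forgot to override the method.
        raise NotImplementedError("method get_tags_container must be implemented")


class Child(RaisingBase):
    def get_tags_container(self, soup):
        # The override replaces the base method, so nothing is raised.
        return "tags"


returned = ReturningBase().get_tags_container(None)
print(isinstance(returned, NotImplementedError))  # True

try:
    RaisingBase().get_tags_container(None)
    raised = False
except NotImplementedError:
    raised = True
print(raised)  # True

print(Child().get_tags_container(None))  # tags
```

With return, the missing implementation goes unnoticed until something tries to use the returned object as a URL or a tag container.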
If further processing is needed, and the subclass implementation is optional but should not be visible to the outside API, you could require subclasses to implement a method by a different name, called from the base class method:
class Crawler():
    def _image_source_url_implementation(self, image_page_soup):
        raise NotImplementedError("method _image_source_url_implementation must be implemented")

    def get_image_source_url(self, image_page_soup):
        try:
            url = self._image_source_url_implementation(image_page_soup)
        except NotImplementedError:
            # do something default
            url = 'something else'
        # process the produced URL further (placeholder: returned unchanged here)
        processed_result = url
        return processed_result
Here self.get_image_source_url() delegates to an optional self._image_source_url_implementation() method.
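To illustrate how this delegation behaves with and without a subclass hook, here is a small sketch. The subclasses WithHook and WithoutHook, the default URL, the lower-casing step, and the plain dict standing in for the soup object are all assumptions made for the example.

```python
class Crawler:
    def _image_source_url_implementation(self, image_page_soup):
        raise NotImplementedError("optional hook, not implemented")

    def get_image_source_url(self, image_page_soup):
        try:
            url = self._image_source_url_implementation(image_page_soup)
        except NotImplementedError:
            # Hypothetical default used when the subclass provides no hook.
            url = "http://example.com/default.jpg"
        # Hypothetical post-processing step applied in every case.
        return url.lower()


class WithHook(Crawler):
    def _image_source_url_implementation(self, image_page_soup):
        # A plain dict stands in for the parsed page here.
        return image_page_soup["src"]


class WithoutHook(Crawler):
    pass


with_hook = WithHook().get_image_source_url({"src": "HTTP://EXAMPLE.COM/A.JPG"})
without_hook = WithoutHook().get_image_source_url({})
print(with_hook)     # http://example.com/a.jpg
print(without_hook)  # http://example.com/default.jpg
```

Callers only ever see get_image_source_url(), so the post-processing runs regardless of whether the hook was supplied.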
If you want to wrap the child type's implementation in something, then you need to use different method names. For example:
class Crawler:
    def get_image_source_url(self, image_page_soup):
        try:
            # Return the child's result so callers actually receive it.
            return self._get_image_source_url(image_page_soup)
        except NotImplementedError:
            raise
        except Exception:
            print('Some exception occurred, fall back to something else')
            # …

    def _get_image_source_url(self, image_page_soup):
        raise NotImplementedError()


class ChildCrawler(Crawler):
    def _get_image_source_url(self, image_page_soup):
        doStuff()
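A runnable variant of this pattern might look like the sketch below. It assumes that logging the error and returning None is an acceptable fallback; DictCrawler and the nested dict standing in for the soup object are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger("crawler")


class Crawler:
    def get_image_source_url(self, image_page_soup):
        try:
            return self._get_image_source_url(image_page_soup)
        except NotImplementedError:
            # A missing implementation is a programming error: let it surface.
            raise
        except Exception:
            # Any other failure in the child is handled centrally in the parent.
            logger.exception("could not extract the image source URL")
            return None

    def _get_image_source_url(self, image_page_soup):
        raise NotImplementedError("_get_image_source_url must be implemented")


class DictCrawler(Crawler):
    def _get_image_source_url(self, image_page_soup):
        # Raises KeyError when the page has no image entry.
        return image_page_soup["img"]["src"]


crawler = DictCrawler()
good = crawler.get_image_source_url({"img": {"src": "a.jpg"}})
bad = crawler.get_image_source_url({})
print(good)  # a.jpg
print(bad)   # None
```

The child class never has to write its own try-except; it only supplies the underscore-prefixed method, and the parent decides how failures are reported.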