I'm doing this little exercise... I want to reorder a string based on some weird dictionary. For example, according to my dictionary, the letters come in the order: "a", "b", "d", "c", "f", "e".
So I figured I should just overload the < operator for strings and call sorted().
Here goes:
class MyString(str):
    new_dict = dict((x,i) for i,x in enumerate(["a", "b", "d", "c", "f", "e"]))
    def __lt__(self,other):
        return self.new_dict[self] < self.new_dict[other]
    def __init__(self,x):
        str.__init__(self,x)
And then
In [59]: sorted((MyString(x) for x in "abcdef"))
Out[59]: ['a', 'b', 'd', 'c', 'f', 'e']
That's awesome. Or even:
In [64]: MyString("".join(sorted((MyString(x) for x in "abcdef"))))
Out[64]: 'abdcfe'
But why can't I just do sorted(MyString("abcdef"))?
In [70]: sorted(MyString("abcdef"))
Out[70]: ['a', 'b', 'c', 'd', 'e', 'f']
Apparently the iterator of MyString is returning strings.
In [72]: for i in MyString("abcdef"):
   ....:     print type(i)
   ....:
<type 'str'>
<type 'str'>
<type 'str'>
<type 'str'>
<type 'str'>
<type 'str'>
What happens if I call join on MyString:
In [63]: type(MyString("").join(sorted((MyString(x) for x in "abcdef"))))
Out[63]: str
Why does MyString have str iterators?
You need to override the __getitem__ method here:
class MyString(str):
    def __getitem__(self, i):
        return type(self)(super(MyString, self).__getitem__(i))
This returns a new instance of the current type:
>>> for i in MyString("abcdef"):
...     print type(i)
...
<class '__main__.MyString'>
<class '__main__.MyString'>
<class '__main__.MyString'>
<class '__main__.MyString'>
<class '__main__.MyString'>
<class '__main__.MyString'>
In Python 2, str itself doesn't implement iteration (it has no __iter__ method), but it does implement the sequence protocol (it has both a __len__ method and a __getitem__ method); it is this protocol that the for loop ultimately uses.
If you are using Python 3, the str object does have an __iter__ method and you need to override that instead:
class MyString(str):
    def __iter__(self):
        return (type(self)(i) for i in super().__iter__())
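A quick way to check what this override changes, as a Python 3 sketch. PlainString is a hypothetical name, used here only as a subclass without the override for comparison:

```python
class PlainString(str):
    """Hypothetical subclass with no overrides, for comparison only."""


class MyString(str):
    def __iter__(self):
        # Wrap every character from str's own iterator in the subclass.
        return (type(self)(ch) for ch in super().__iter__())


# Collect the types of the items each class yields during iteration.
plain_types = {type(ch) for ch in PlainString("abc")}
custom_types = {type(ch) for ch in MyString("abc")}

print(plain_types)
print(custom_types)
```

Without the override, iteration yields plain str objects; with it, each character is a MyString.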
Note that str is an immutable type, so overriding __init__ has little influence on the instance.
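To illustrate the point about immutability: the text of a str instance is fixed in __new__, so that is the method to override if the value itself must change. A minimal Python 3 sketch, with UpperString as a hypothetical example class that is not part of the question:

```python
class UpperString(str):
    """Hypothetical example: a str subclass that upper-cases its value."""

    def __new__(cls, value):
        # The string's content is decided here, before __init__ runs.
        return super().__new__(cls, value.upper())

    def __init__(self, value):
        # Too late to change the text; only extra attributes can be set.
        self.original = value


s = UpperString("abc")
print(s, s.original)  # ABC abc
```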
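For ordering, you really need to implement all of the __gt__, __ge__, __eq__, etc. methods too. The @functools.total_ordering decorator can save most of that work, but it only fills in methods the class does not already have, and str already defines every comparison method, so check the result when subclassing str: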
from functools import total_ordering

@total_ordering
class MyString(str):
    sortmap = {x: i for i, x in enumerate("abdcfe")}

    def __lt__(self, other):
        return self.sortmap[self] < self.sortmap[other]

    # inherit __eq__ from str

    def __getitem__(self, i):
        return type(self)(super(MyString, self).__getitem__(i))
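One caveat is worth verifying before relying on the decorator with a str subclass: total_ordering only adds comparison methods that the class does not already have beyond those of object, and str defines all of them. The Python 3 sketch below compares a decorated class with one that spells out each operator. The names Decorated, Explicit and SORTMAP are hypothetical and used only for this illustration:

```python
from functools import total_ordering

SORTMAP = {x: i for i, x in enumerate("abdcfe")}


@total_ordering
class Decorated(str):
    # Only __lt__ is customized; >, <= and >= are still inherited from str.
    def __lt__(self, other):
        return SORTMAP[self] < SORTMAP[other]


class Explicit(str):
    # Every ordering operator is defined in terms of the custom ranks.
    def __lt__(self, other):
        return SORTMAP[self] < SORTMAP[other]

    def __le__(self, other):
        return SORTMAP[self] <= SORTMAP[other]

    def __gt__(self, other):
        return SORTMAP[self] > SORTMAP[other]

    def __ge__(self, other):
        return SORTMAP[self] >= SORTMAP[other]


# In the custom alphabet "d" comes before "c", so "d" > "c" should be False.
d_gt_c_decorated = Decorated("d") > Decorated("c")
d_gt_c_explicit = Explicit("d") > Explicit("c")
print(d_gt_c_decorated, d_gt_c_explicit)  # True False
```

Since sorted() only relies on <, the decorated class still sorts as intended; the difference shows up only when the other operators are used directly.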
Last but not least, for sorting, just use the key argument to sorted() here:
>>> sortmap = {x: i for i, x in enumerate("abdcfe")}
>>> sorted('abcdef', key=sortmap.get)
['a', 'b', 'd', 'c', 'f', 'e']
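One thing to watch with sortmap.get as the key: for a character that is not in the mapping it returns None, and Python 3 refuses to order None against an integer. A minimal sketch, assuming unknown characters should simply sort last (that fallback rank is an assumption, and rank is a hypothetical helper name):

```python
sortmap = {x: i for i, x in enumerate("abdcfe")}


def rank(char):
    # Unknown characters get a rank just past the end of the custom alphabet.
    return sortmap.get(char, len(sortmap))


result = sorted("fzcadb", key=rank)
print(result)  # ['a', 'b', 'd', 'c', 'f', 'z']
```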
You don't need a subclass to customize sort behavior: you can pass a key parameter to a sort method or sorted call, specifying a function that gives the relative weight of each element being compared.
Like in:
def mycomp(text):
    myseq = "abdcfe"
    # this will place -1's for chars not found in your mapping string
    weighted = [myseq.find(char) for char in text]
    return weighted
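A short usage sketch of such a key function. The word list is made up for illustration; because the key is a list of per-character ranks, whole words are compared letter by letter in the custom alphabet:

```python
def mycomp(text):
    myseq = "abdcfe"
    # Rank each character by its position in the custom alphabet;
    # str.find returns -1 for characters that are not in myseq.
    return [myseq.find(char) for char in text]


# Hypothetical sample data.
words = ["cab", "dab", "bad", "ace"]
sorted_words = sorted(words, key=mycomp)
print(sorted_words)  # ['ace', 'bad', 'dab', 'cab']

# The same key also reorders the characters of a single string.
reordered = "".join(sorted("abcdef", key=mycomp))
print(reordered)  # abdcfe
```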
You should indeed use the key parameter instead of your approach. The reason it is not working, however, is simply that you didn't override the __iter__ method:
class MyString(str):
    # ...
    def __iter__(self):
        for x in super().__iter__():
            yield self.__class__(x)
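Putting the pieces together, here is a Python 3 sketch of the asker's class with this __iter__ override added. The dictionary is renamed sortmap here, which is only a naming choice for the illustration:

```python
class MyString(str):
    sortmap = {x: i for i, x in enumerate("abdcfe")}

    def __lt__(self, other):
        # Compare by position in the custom alphabet.
        return self.sortmap[self] < self.sortmap[other]

    def __iter__(self):
        # Yield each character as a MyString so the custom __lt__ applies.
        for x in super().__iter__():
            yield self.__class__(x)


result = sorted(MyString("abcdef"))
print(result)  # ['a', 'b', 'd', 'c', 'f', 'e']
```

Note that "".join(result) still returns a plain str, so the joined text has to be wrapped in MyString again if the subclass is needed afterwards.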
In Python 2 you can use:
class MyString(str):
    # ...
    def __iter__(self):
        # Python 2's str has no __iter__, so iterate over the plain str value
        for x in super(MyString, self).__str__():
            yield self.__class__(x)