简体   繁体   中英

Mutable strings in Python

Please, do you know of a Python library which provides mutable strings? Google returned surprisingly few results. The only usable library I found is http://code.google.com/p/gapbuffer/ which is in C but I would prefer it to be written in pure Python.

Edit: Thanks for the responses but I'm after an efficient library. That is, ''.join(list) might work but I was hoping for something more optimized. Also, it has to support the usual stuff regular strings do, like regex and unicode.

在 Python 中可变序列类型是bytearray看到 这个链接

This will allow you to efficiently change characters in a string. Although you can't change the string length.

>>> import ctypes

>>> a = 'abcdefghijklmn'
>>> mutable = ctypes.create_string_buffer(a)
>>> mutable[5:10] = ''.join( reversed(list(mutable[5:10].upper())) )
>>> a = mutable.value
>>> print `a, type(a)`
('abcdeJIHGFklmn', <type 'str'>)
class MutableString(object):
    def __init__(self, data):
        self.data = list(data)
    def __repr__(self):
        return "".join(self.data)
    def __setitem__(self, index, value):
        self.data[index] = value
    def __getitem__(self, index):
        if type(index) == slice:
            return "".join(self.data[index])
        return self.data[index]
    def __delitem__(self, index):
        del self.data[index]
    def __add__(self, other):
        self.data.extend(list(other))
    def __len__(self):
        return len(self.data)

... and so on, and so forth.

You could also subclass StringIO, buffer, or bytearray.

How about simply sub-classing list (the prime example for mutability in Python)?

class CharList(list):

    def __init__(self, s):
        list.__init__(self, s)

    @property
    def list(self):
        return list(self)

    @property
    def string(self):
        return "".join(self)

    def __setitem__(self, key, value):
        if isinstance(key, int) and len(value) != 1:
            cls = type(self).__name__
            raise ValueError("attempt to assign sequence of size {} to {} item of size 1".format(len(value), cls))
        super(CharList, self).__setitem__(key, value)

    def __str__(self):
        return self.string

    def __repr__(self):
        cls = type(self).__name__
        return "{}(\'{}\')".format(cls, self.string)

This only joins the list back to a string if you want to print it or actively ask for the string representation. Mutating and extending are trivial, and the user knows how to do it already since it's just a list.

Example usage:

s = "te_st"
c = CharList(s)
c[1:3] = "oa"
c += "er"
print c # prints "toaster"
print c.list # prints ['t', 'o', 'a', 's', 't', 'e', 'r']

The following is fixed, see update below.

There's one (solvable) caveat: There's no check (yet) that each element is indeed a character. It will at least fail printing for everything but strings. However, those can be joined and may cause weird situations like this: [see code example below]

With the custom __setitem__ , assigning a string of length != 1 to a CharList item will raise a ValueError . Everything else can still be freely assigned but will raise a TypeError: sequence item n: expected string, X found when printing, due to the string.join() operation. If that's not good enough, further checks can be added easily (potentially also to __setslice__ or by switching the base class to collections.Sequence (performance might be different?!), cf. here )

s = "test"
c = CharList(s)
c[1] = "oa"
# with custom __setitem__ a ValueError is raised here!
# without custom __setitem__, we could go on:
c += "er"
print c # prints "toaster"
# this looks right until here, but:
print c.list # prints ['t', 'oa', 's', 't', 'e', 'r']

Efficient mutable strings in Python are arrays . PY3 Example for unicode string using array.array from standard library:

>>> ua = array.array('u', 'teststring12')
>>> ua[-2:] = array.array('u', '345')
>>> ua
array('u', 'teststring345')
>>> re.search('string.*', ua.tounicode()).group()
'string345'

bytearray is predefined for bytes and is more automatic regarding conversion and compatibility.

You can also consider memoryview / buffer , numpy arrays, mmap and multiprocessing.shared_memory for certain cases.

The FIFOStr package in pypi supports pattern matching and mutable strings. This may or may not be exactly what is wanted but was created as part of a pattern parser for a serial port (the chars are added one char at a time from left or right - see docs). It is derived from deque.

from fifostr import FIFOStr

myString = FIFOStr("this is a test")
myString.head(4) == "this"  #true
myString[2] = 'u'
myString.head(4) == "thus"  #true

(full disclosure I'm the author of FIFOstr)

Just do this
string = "big"
string = list(string)
string[0] = string[0].upper()
string = "".join(string)
print(string)

'''OUTPUT'''
> Big

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM