
How do I insert a word into a list in Python?

I have a list.

I need to add a word to the list, but I'm not sure how to do so.

For example,

I have a list = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']

You can use a specialist library such as sortedcontainers, which is more efficient than a naive list.sort after each insertion. The complexity of SortedList.add is ~O(log n).

from sortedcontainers import SortedList

lst = SortedList(['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony'])

lst.add('Beatrice')

print(lst)

SortedList(['Alice', 'Amy', 'Andy', 'Beatrice', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony'])

If you're certain you have a sorted list, you could implement a naive insertion sort:

import operator

def insertion_sort(lst, newval, key=None):
    """insertion_sort(lst, newval) modifies the pre-sorted lst by inserting
    newval in its sorted order
    """

    if key is None:
        # default to the ordinary less-than comparison
        key = operator.lt

    for i, val in enumerate(lst):
        if key(val, newval):
            continue
        else:
            # val is not less than newval, so we insert here and stop
            lst.insert(i, newval)
            return

    # every element is less than newval (or lst is empty), so it goes last
    lst.append(newval)

Or you can, less-naively, use the stdlib bisect module to insert for you.

import bisect

l = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
bisect.insort(l, "Andrew")  # or insort_left
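
For what it's worth, insort amounts to a binary search followed by list.insert. A minimal sketch of that two-step version, using only the stdlib (the helper name insert_sorted is made up here for illustration, and it assumes the list is already sorted):

```python
import bisect

def insert_sorted(lst, value):
    """Insert value into the already-sorted lst and return the index used."""
    # binary search for the leftmost index where value can go
    idx = bisect.bisect_left(lst, value)
    # list.insert shifts everything after idx one slot to the right
    lst.insert(idx, value)
    return idx

names = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
pos = insert_sorted(names, 'Andrew')
print(pos, names)
```

Getting the index back can be handy if you also need to know where the new word landed.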

Try this:

>>> l = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
>>> l.append('Beatrice')
>>> l.sort()
>>> l
['Alice', 'Amy', 'Andy', 'Beatrice', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
>>> 

First, to "check matching letters" between two strings, all you really need is <. Strings compare lexicographically: 'abc' < 'abd' < 'acd' < 'acdd'.
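
A quick sketch of what that comparison does in practice. The names used are just examples, and the casefold line is an optional extra for when case should be ignored, not something the question asked for:

```python
# Python compares strings character by character, by Unicode code point.
chained = 'abc' < 'abd' < 'acd' < 'acdd'
names = 'Beatrice' < 'Betty'        # decided at the third character: 'a' < 't'
mixed_case = 'Zoe' < 'adam'         # uppercase letters sort before lowercase ones
folded = 'Zoe'.casefold() < 'adam'.casefold()  # compare case-insensitively
print(chained, names, mixed_case, folded)  # True True True False
```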


With a linked list, you have to search nodes from the head to find the location. Keep track of a prev and next node as you go, and as soon as you find a next node whose value is greater than the new value, insert the new node after prev. (If you're using a bare-node implementation, make sure your function returns the head; otherwise, there's no way to insert before the head.)

Of course this automatically means linear time to find the right position (and, if you're using immutable nodes, also linear time to build the new nodes back up to the head).

Given your implementation, that could look like these methods on SingleLinkedList:

def find_pos(self, element):
    '''Returns the position before element, or None if no such position'''
    prev, node = None, self._head
    while node:
        if node.element > element:
            return prev
        prev, node = node, node.next
    return prev

def insert(self, element):
    pos = self.find_pos(element)
    if pos:
        # add it after the node we found
        node = Node(element, pos.next)
        pos.next = node
    else:
        # add it before the current head
        node = Node(element, self._head)
        self._head = node
    self._size += 1
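
To try those two methods without the rest of your class, here is a self-contained sketch. The Node and SingleLinkedList definitions below are hypothetical stand-ins (I'm assuming a node with element and next attributes and a list with _head and _size), so adjust them to match your real code:

```python
class Node:
    """Hypothetical minimal node: a value plus a link to the next node."""
    def __init__(self, element, next=None):
        self.element = element
        self.next = next

class SingleLinkedList:
    """Hypothetical stand-in containing only what the demo needs."""
    def __init__(self):
        self._head = None
        self._size = 0

    def find_pos(self, element):
        # last node whose value is <= element, or None if element goes first
        prev, node = None, self._head
        while node is not None:
            if node.element > element:
                return prev
            prev, node = node, node.next
        return prev

    def insert(self, element):
        pos = self.find_pos(element)
        if pos is not None:
            # splice the new node in after the node we found
            pos.next = Node(element, pos.next)
        else:
            # nothing smaller exists, so the new node becomes the head
            self._head = Node(element, self._head)
        self._size += 1

    def __iter__(self):
        node = self._head
        while node is not None:
            yield node.element
            node = node.next

names = SingleLinkedList()
for name in ['Betty', 'Alice', 'Tony', 'Amy', 'Beatrice']:
    names.insert(name)
print(list(names))
```

Because every insert goes through find_pos, the list stays sorted no matter what order the words arrive in.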

With a random-access data structure like an array (a Python list), you can bisect to find the right location in log time. But with an array, you still need linear time to do the insert, because all of the subsequent values have to be shifted up. (Although this is usually linear with a much faster constant than the linked-list search.)

bisect.insort(lst, value)

One last thing: If you're doing a ton of inserts in a row, it's often more efficient to batch them up. In fact, just calling extend and then sort may actually be faster than calling insort for each one, if the number of elements being added is a sizable fraction of the list.
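
A small sketch of the two approaches side by side. The function names are made up for illustration, and which one is faster depends on your sizes, so measure with timeit on your own data before deciding:

```python
import bisect

def add_one_by_one(sorted_lst, new_items):
    # one binary search plus one O(n) shift per item
    for item in new_items:
        bisect.insort(sorted_lst, item)
    return sorted_lst

def add_in_batch(sorted_lst, new_items):
    # append everything, then sort once; list.sort takes advantage of
    # the part of the list that is already in order
    sorted_lst.extend(new_items)
    sorted_lst.sort()
    return sorted_lst

base = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
new = ['Beatrice', 'Andrew', 'Zed']
one_by_one = add_one_by_one(list(base), new)
batched = add_in_batch(list(base), new)
print(one_by_one == batched)
```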


If you want to get the best of both worlds, you need a more complex data structure:

  • A balanced binary search tree of some kind (red-black, AVL, etc.) is the traditional answer, although it tends to be pretty slow in practice.
  • A wider search tree, like any of the B-tree variants, avoids most of the performance costs of binary trees (and lets you search with a higher log base, to boot).
  • A skiplist is a linked list with log N "higher-level" linked lists threaded through it (or stacked above it), so you can bisect it. And there are other variations on this "indexed list" concept.
  • There are multiple Python implementations of complicated hybrids, like a deque/rope structure with an optional B-tree-variant stacked on top.

Popular implementations include blist.sortedlist, sortedcontainers.SortedList, pybst.AVLTree, etc.

But really, almost any implementation of any such structure you find in Python is going to have this behavior built in. So the right answer will probably just be something like this:

lst.add(val)

Using a binary tree, insertion can be performed in O(height_of_tree):

class Tree:
  def __init__(self, value = None):
    self.right, self.left, self.value = None, None, value
  def __lt__(self, _node):
    return self.value < getattr(_node, 'value', _node)
  def insert_val(self, _val):
    if self.value is None:
      self.value = _val
    else:
      if _val < self.value:
         if self.left is None:
           self.left = Tree(_val)
         else:
           self.left.insert_val(_val)
      else:
          if self.right is None:
            self.right = Tree(_val)
          else:
            self.right.insert_val(_val)
  def flatten(self):
     return [*getattr(self.left, 'flatten', lambda :[])(), self.value, *getattr(self.right, 'flatten', lambda :[])()]

t = Tree()
l = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
for i in l:
  t.insert_val(i)

t.insert_val('Beatrice')
print(t.flatten())

Output:

['Alice', 'Amy', 'Andy', 'Beatrice', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
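
One caveat on O(height_of_tree): a plain binary search tree like this one does not rebalance itself, so the height depends on the order of insertion. Feeding it already-sorted values gives a chain as tall as the list is long. A rough sketch that measures this, using a separate iterative implementation (Node, insert, height and build are hypothetical helpers, not the Tree class above):

```python
import random

class Node:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None

def insert(root, value):
    """Iteratively insert value into an unbalanced BST; return the root."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right

def height(root):
    """Count the nodes on the longest root-to-leaf path, level by level."""
    level, h = ([root] if root is not None else []), 0
    while level:
        h += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return h

def build(values):
    root = None
    for v in values:
        root = insert(root, v)
    return root

values = list(range(500))
sorted_height = height(build(values))      # sorted input: one long chain
random.seed(0)
random.shuffle(values)
shuffled_height = height(build(values))    # shuffled input: much shallower
print(sorted_height, shuffled_height)
```

If your words arrive in sorted or nearly sorted order, a self-balancing tree or one of the list-based approaches is likely the safer choice.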

With a linked-list, you can perform the add and insert operations in a single method, applying additional logic:

class LinkedList:
  def __init__(self, value=None):
    self.value = value
    self._next = None
  def insert_val(self, _val):
    if self.value is None:
      # empty list: store the first value in this node
      self.value = _val
    elif _val < self.value:
      # _val belongs before this node: move this node's value into a new successor
      _node = LinkedList(self.value)
      _node._next = self._next
      self.value, self._next = _val, _node
    elif self._next is None:
      # reached the end of the list: append
      self._next = LinkedList(_val)
    else:
      # keep walking down the list
      self._next.insert_val(_val)
  def flatten(self):
     return [self.value, *getattr(self._next, 'flatten', lambda :[])()]
  @classmethod
  def load_list(cls):
    _l = cls()
    for i in ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']:
       _l.insert_val(i)
    return _l

l = LinkedList.load_list()
print(l.flatten())
>>>['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
l.insert_val('Beatrice')
print(l.flatten())
>>>['Alice', 'Amy', 'Andy', 'Beatrice', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']

The bisect module provides support for maintaining a list in sorted order without having to sort the list after each insertion.

The bisect.insort_left() method will "insert the item in the list in sorted order":

import bisect

a = ['Alice', 'Amy', 'Andy', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
x = 'Beatrice'

bisect.insort_left(a, x)

print(a)

['Alice', 'Amy', 'Andy', 'Beatrice', 'Betty', 'Eric', 'Peter', 'Richard', 'Tony']
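
As a side note, insort_left and insort (which is the same as insort_right) only differ when the value is already in the list. With plain strings the resulting list looks identical either way, but the underlying search positions are different, as this small sketch shows (the sample list is made up):

```python
import bisect

a = ['Alice', 'Amy', 'Amy', 'Andy']

left = bisect.bisect_left(a, 'Amy')    # index of the first 'Amy'
right = bisect.bisect_right(a, 'Amy')  # index just past the last 'Amy'
print(left, right)  # 1 3

# for a value that is not in the list, both searches agree
missing_left = bisect.bisect_left(a, 'Beatrice')
missing_right = bisect.bisect_right(a, 'Beatrice')
print(missing_left, missing_right)  # 4 4
```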

In plain Python, you could do:

def ins_sorted(toIns, lst): # insert a value into an already-sorted list
    for i in range(0, len(lst)):
        if toIns < lst[i]:
            lst.insert(i, toIns) # insert to the left of the first larger value
            break # stop once the value is inserted
    else:
        lst.append(toIns) # no larger value found, so toIns goes at the end
    return lst

Why does this work? Strings can be compared just like numbers in Python: 'A' < 'B', 'C' > 'B', etc.

To be fair: It's not the most efficient option, and bisect.insort is better, but if you want your own code you can control, there you are.

Timing code:

import timeit

setupStr='''def ins_sorted(toIns, list): # insert a value, with respect to a sorted list
    for i in range(0, len(list)):
        if(toIns < list[i]):
            list.insert(i, toIns) # insert the value to the left of the one it DOESNT match
            break # efficiency!
    return list'''

a = timeit.timeit('ins_sorted("c", ["a", "b", "d", "e"])', number=100000, setup=setupStr)
print(a)

b = timeit.timeit('bisect.insort(["a", "b", "d", "e"], "c")', number=100000, setup='import bisect')
print(b)

Timing results:

0.25098993408028036
0.05763813108205795

Try using the bisect library:

>>> import bisect
>>> someList = ["a", "b", "d"]
>>> bisect.insort(someList,'c')
>>> someList
['a', 'b', 'c', 'd']
>>> 
