
Fastest way to construct tuple from elements of list (Python)

I have 3 NumPy arrays, and I want to create tuples of the i-th element of each list. These tuples represent keys for a dictionary I had previously defined.

Ex:

List 1: [1, 2, 3, 4, 5]

List 2: [6, 7, 8, 9, 10]

List 3: [11, 12, 13, 14, 15]

Desired output: [mydict[(1,6,11)],mydict[(2,7,12)],mydict[(3,8,13)],mydict[(4,9,14)],mydict[(5,10,15)]]

These tuples represent keys of a dictionary I have previously defined (essentially, as input variables to a previously calculated function). I had read that this is the best way to store function values for lookup.

My current method of doing this is as follows:

[mydict[x] for x in zip(l1, l2, l3)]

This works, but is obviously slow. Is there a way to vectorize this operation, or make it faster in any way? I'm open to changing the way I've stored the function values as well, if that is necessary.

EDIT: My apologies for the question being unclear. I do in fact, have NumPy arrays. My mistake for referring to them as lists and displaying them as such. They are of the same length.

Your question is a bit confusing, since you're calling these NumPy arrays, and asking for a way to vectorize things, but then showing lists, and labeling them as lists in your example, and using list in the title. I'm going to assume you do have arrays.

>>> l1 = np.array([1, 2, 3, 4, 5])
>>> l2 = np.array([6, 7, 8, 9, 10])
>>> l3 = np.array([11, 12, 13, 14, 15])

If so, you can stack these up in a 2D array:

>>> ll = np.stack((l1, l2, l3))

And then you can just transpose that:

>>> lt = ll.T

This is better than vectorized; it's constant-time. NumPy is just creating another view of the same data, with different striding so it reads in column order instead of row order.

>>> lt
array([[ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14],
       [ 5, 10, 15]])
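You can confirm that nothing was copied by checking that the transpose is a view sharing ll's buffer (a quick check on the arrays above; the exact stride values depend on your platform's default integer size):

>>> lt.base is ll
True
>>> ll.strides, lt.strides
((40, 8), (8, 40))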

As miradulo points out, you can do both of these in one step with column_stack:

>>> lt = np.column_stack((l1, l2, l3))
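(A quick check with the arrays above that both routes produce the same rows:)

>>> np.array_equal(np.column_stack((l1, l2, l3)), ll.T)
True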

But I suspect you're actually going to want ll as a value in its own right. (Although I admit I'm just guessing here at what you're trying to do…)


And of course if you want to loop over these rows as 1D arrays instead of doing further vectorized work, you can:

>>> for row in lt:
...     print(row)
[ 1  6 11]
[ 2  7 12]
[ 3  8 13]
[ 4  9 14]
[ 5 10 15]

Of course, you can convert them from 1D arrays to tuples just by calling tuple on each row. Or, whatever that mydict is supposed to be (it doesn't look like a dictionary, since there are no key-value pairs, just values), you can do that.

>>> mydict = collections.namedtuple('mydict', list('abc'))
>>> tups = [mydict(*row) for row in lt]
>>> tups
[mydict(a=1, b=6, c=11),
 mydict(a=2, b=7, c=12),
 mydict(a=3, b=8, c=13),
 mydict(a=4, b=9, c=14),
 mydict(a=5, b=10, c=15)]
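If, on the other hand, your mydict really is a plain dict keyed by these 3-tuples as in the question, the rows give you the keys directly once you call tuple on them (ndarray rows themselves aren't hashable, so they can't be dict keys). A minimal sketch; func_values here is a hypothetical stand-in for the question's dict, named differently to avoid clashing with the namedtuple above:

keys = [tuple(row) for row in lt]        # each row becomes a hashable 3-tuple
vals = [func_values[k] for k in keys]    # hypothetical dict of precomputed function values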

If you're worried about the time to look up a tuple of keys in a dict, itemgetter in the operator module has a C-accelerated version. If keys is an np.array, a tuple, or whatever, you can do this:

import operator

for row in lt:
    myvals = operator.itemgetter(*row)(mydict)
    # do stuff with myvals

Meanwhile, I decided to slap together a C extension that should be as fast as possible (with no error handling, because I'm lazy, and it should be a tiny bit faster that way; this code will probably segfault if you give it anything but a dict and a tuple or list):

static PyObject *
itemget_itemget(PyObject *self, PyObject *args) {
  PyObject *d;     /* the dict to look keys up in */
  PyObject *keys;  /* a tuple or list of keys */
  PyArg_ParseTuple(args, "OO", &d, &keys);
  /* Get a flat C array of the key objects. */
  PyObject *seq = PySequence_Fast(keys, "keys must be an iterable");
  PyObject **arr = PySequence_Fast_ITEMS(seq);
  Py_ssize_t seqlen = PySequence_Fast_GET_SIZE(seq);
  /* Build a result tuple of the same length and fill it in place. */
  PyObject *result = PyTuple_New(seqlen);
  PyObject **resarr = PySequence_Fast_ITEMS(result);
  for (Py_ssize_t i = 0; i != seqlen; ++i) {
    /* PyDict_GetItem returns a borrowed reference (NULL if the key is
       missing, which is where the segfault warning above comes from). */
    resarr[i] = PyDict_GetItem(d, arr[i]);
    Py_INCREF(resarr[i]);
  }
  Py_DECREF(seq);
  return result;
}
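If you compile that into an extension module named itemget (the build scaffolding isn't shown here), using it looks roughly like the itemgetter loop above. A sketch:

import itemget  # assumed name of the compiled extension module built from the C code above

for row in lt:
    myvals = itemget.itemget(mydict, tuple(row))  # tuple of mydict[k] for each k in the row
    # do stuff with myvals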

Times for looking up 100 random keys out of a 10000-key dictionary on my laptop with python.org CPython 3.7 on macOS:

  • itemget.itemget: 1.6µs
  • operator.itemgetter: 1.8µs
  • comprehension: 3.4µs
  • pure-Python operator.itemgetter: 6.7µs

So, I'm pretty sure anything you do is going to be fast enough; that's only 34ns/key that we're trying to optimize. But if that really is too slow, operator.itemgetter does a good enough job moving the loop to C and cuts it roughly in half, which is pretty close to the best possible result you could expect. (It's hard to imagine looking up a bunch of boxed-value keys in a hash table in much less than 16ns/key, after all.)
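For reference, a rough sketch of how timings like these can be reproduced with timeit; the dictionary contents and the choice of integer keys here are just placeholders:

import operator
import random
import timeit

mydict = {i: i * i for i in range(10_000)}   # placeholder 10000-key dict
keys = random.sample(list(mydict), 100)      # 100 random keys to look up

getter = operator.itemgetter(*keys)          # C-accelerated multi-key lookup
# each call prints total seconds for 100_000 runs; divide by 100_000 for per-call time
print(timeit.timeit(lambda: getter(mydict), number=100_000))
print(timeit.timeit(lambda: [mydict[k] for k in keys], number=100_000))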

Define your 3 lists. You mention 3 arrays, but show lists (and call them that as well):

In [112]: list1,list2,list3 = list(range(1,6)),list(range(6,11)),list(range(11,16))

Now create a dictionary with tuple keys:

In [114]: dd = {x:i for i,x in enumerate(zip(list1,list2,list3))}
In [115]: dd
Out[115]: {(1, 6, 11): 0, (2, 7, 12): 1, (3, 8, 13): 2, (4, 9, 14): 3, (5, 10, 15): 4}

Accessing elements from that dictionary with your code:

In [116]: [dd[x] for x in zip(list1,list2,list3)]
Out[116]: [0, 1, 2, 3, 4]
In [117]: timeit [dd[x] for x in zip(list1,list2,list3)]
1.62 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Now for an array equivalent - turn the lists into a 2d array:

In [118]: arr = np.array((list1,list2,list3))
In [119]: arr
Out[119]: 
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15]])

Access the same dictionary elements. If I had used column_stack I could have omitted the .T, but that's slower to construct (array transpose is fast):

In [120]: [dd[tuple(x)] for x in arr.T]
Out[120]: [0, 1, 2, 3, 4]
In [121]: timeit [dd[tuple(x)] for x in arr.T]
15.7 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Notice that this is substantially slower. Iteration over an array is slower than iteration over a list. You can't access elements of a dictionary in any sort of numpy 'vectorized' fashion - you have to use a Python iteration.

I can improve on the array iteration by first turning it into a list:

In [124]: arr.T.tolist()
Out[124]: [[1, 6, 11], [2, 7, 12], [3, 8, 13], [4, 9, 14], [5, 10, 15]]
In [125]: timeit [dd[tuple(x)] for x in arr.T.tolist()]
3.21 µs ± 9.67 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Array construction times:

In [122]: timeit arr = np.array((list1,list2,list3))
3.54 µs ± 15.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [123]: timeit arr = np.column_stack((list1,list2,list3))
18.5 µs ± 11.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

With the pure-Python itemgetter (from v3.6.3) there are no savings:

In [149]: timeit operator.itemgetter(*[tuple(x) for x in arr.T.tolist()])(dd)
3.51 µs ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

and if I move the getter definition out of the timed loop:

In [151]: %%timeit idx = operator.itemgetter(*[tuple(x) for x in arr.T.tolist()])
     ...: idx(dd)
     ...:
482 ns ± 1.85 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
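In other words, if the same set of tuple keys will be reused for many lookups (an assumption about the workload), it pays to build the getter once, outside the hot path, and only call it each time:

import operator

# pay the tuple conversion and itemgetter construction cost once
idx = operator.itemgetter(*[tuple(x) for x in arr.T.tolist()])

# each later use is just the fast C-level multi-key lookup (~0.5 µs above)
vals = idx(dd)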
