Best way to resize a data sequence in Python

Question

I've got a data sequence (a list) that I have to resize. I've written a function for it, but it's very crude. Does anyone know of a better way to solve this?

Expected behaviour:

Edit: even though the example is linear, you can't assume that the sequence is built from a formula. In all examples my input data sequence is the following:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

When I resize it from 10 items to 5, I expect something like the following output:

[1, 3, 5, 7, 9] or [2, 4, 6, 8, 10]

None of this is very difficult when you cut the length of the data sequence in half, but the size of my output sequence is variable. It could be smaller or larger than the length of the original sequence.

When I resize it from 10 items to 19 (easy number to do manually), I expect something like this:

[1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10]

Current function

def sequenceResize(source, length):
    """
    Crude way of resizing a data sequence.
    Shrinking is here a lot more accurate than expanding.
    """
    sourceLen = len(source)
    out = []
    for i in range(length):
        key = int(i * (sourceLen / length))
        if key >= sourceLen:
            key = sourceLen - 1

        out.append(source[key])
    return out

This results in the following:

>>> sequence = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> sequenceResize(sequence, 5)
[1, 3, 5, 7, 9]
>>> sequenceResize(sequence, 19)
[1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10]

Shrinking is accurate, but expanding just repeats values instead of interpolating between them.
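
The repeated values in the expanded output come from int() truncating the fractional index. A small sketch that lists each fractional position next to the key it is truncated to; the helper name resize_keys is only for illustration, and Python 3 true division is assumed:

```python
def resize_keys(source_len, length):
    """Return (fractional position, truncated key) pairs, mirroring the crude resize."""
    pairs = []
    for i in range(length):
        position = i * (source_len / length)
        # int() drops the fractional part, so neighbouring positions share a key
        key = min(int(position), source_len - 1)
        pairs.append((position, key))
    return pairs

# First few positions when expanding 10 items to 19
for position, key in resize_keys(10, 19)[:5]:
    print(round(position, 3), key)
```

Positions such as 0.526 and 1.579 carry the information needed for interpolation, but it is thrown away when they are truncated to 0 and 1.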

Does anyone know of an existing or simple way to tackle this problem properly?

Answer 1

You can use np.linspace:

import numpy as np

list_in = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

resize = 19

list_out = np.linspace(list_in[0], list_in[-1], num=resize)

print(list_out.tolist())

Output:

[1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]
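
Note that np.linspace only looks at the first and last values, so it reproduces the input only when the data is evenly spaced. For data that does not follow a formula, one possible sketch uses np.interp to sample the list at evenly spaced fractional indices. The function name resize_with_interp is an assumption made for illustration:

```python
import numpy as np

def resize_with_interp(source, length):
    """Linearly interpolate `source` onto `length` evenly spaced positions."""
    # Positions of the original samples: 0, 1, ..., len(source) - 1
    old_positions = np.arange(len(source))
    # Target positions spread evenly over the same index range
    new_positions = np.linspace(0, len(source) - 1, num=length)
    return np.interp(new_positions, old_positions, source).tolist()

print(resize_with_interp([1, 2, 4, 8, 16], 3))
print(resize_with_interp([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 19))
```

This keeps every intermediate value of the input in play, at the cost of always returning floats.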

Answer 2

Instead of determining the index directly, you should calculate the ratio of "steps" between indices in both lists. Note that there is one fewer step than there are elements in the list. Then, you can get the floor and ceil item and determine the final value based on the decimal part of the current step, getting the weighted average between the two (see figure below).

import math

def sequenceResize(source, length):
    step = float(len(source) - 1) / (length - 1)
    for i in range(length):
        key = i * step
        low = source[int(math.floor(key))]
        high = source[int(math.ceil(key))]
        ratio = key % 1
        yield (1 - ratio) * low + ratio * high

Or a bit shorter, using divmod:

def sequenceResize(source, length):
    step = float(len(source) - 1) / (length - 1)
    for i in range(length):
        low, ratio = divmod(i * step, 1)
        high = low + 1 if ratio > 0 else low
        yield (1 - ratio) * source[int(low)] + ratio * source[int(high)]

Examples (values rounded to five decimal places for readability):

>>> sequence = [1, 2, 4, 8, 16]
>>> list(sequenceResize(sequence, 5))
[1.0, 2.0, 4.0, 8.0, 16.0]
>>> list(sequenceResize(sequence, 3))
[1.0, 4.0, 16.0]
>>> list(sequenceResize(sequence, 10))
[1.0, 1.44444, 1.88889, 2.66667, 3.55556, 4.88889, 6.66667, 8.88889, 12.44444, 16.0]
>>> list(sequenceResize(sequence, 19))
[1.0, 1.22222, 1.44444, 1.66667, 1.88889, 2.22222, 2.66667, 3.11111, 3.55556, 4.0, 4.88889, 5.77778, 6.66667, 7.55556, 8.88889, 10.66667, 12.44444, 14.22222, 16.0]

A different example as an illustration. Blue are the original values, and red the interpolated ones.
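
Both generators divide by length - 1, so a target length of 1 raises ZeroDivisionError, and rounding in i * step could in principle push the last key just past the final index. A minimal sketch that guards these cases, assuming a list result is wanted and that repeating the first value is acceptable when there is nothing to interpolate; resize_clamped is a hypothetical name:

```python
def resize_clamped(source, length):
    """Linear-interpolation resize with guards for short inputs and index overflow."""
    if length <= 0 or not source:
        return []
    if length == 1 or len(source) == 1:
        # No step can be computed; fall back to repeating the first value
        return [float(source[0])] * length
    last = len(source) - 1
    out = []
    for i in range(length):
        # Multiply before dividing so the final position is exactly `last`
        position = i * last / (length - 1)
        low = min(int(position), last)
        high = min(low + 1, last)  # clamp so the upper neighbour stays in range
        ratio = position - low
        out.append((1 - ratio) * source[low] + ratio * source[high])
    return out

print(resize_clamped([1, 2, 4, 8, 16], 3))
```

Clamping high rather than testing ratio > 0 keeps the loop body free of branches; when ratio is 0 the upper neighbour has no weight anyway.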

2 answers

solution1
4 2017-12-17 11:49:58

solution2
1 ACCEPTED 2017-12-17 12:56:54
