What is the most efficient algorithm to find the midpoint index of each repeated sequence of numbers?
a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0]
I want to obtain the midpoint index of each run of repeated values, i.e.
output_vector = [2, 8, 14, 19]
That is, output_vector[0] is the index of the midpoint of the first sequence 0, 0, 0, 0, 0
output_vector[1] is the midpoint of the second repeated sequence 1, 1, 1, 1, 1, 1, 1
output_vector[2] is the midpoint of the third repeated sequence -1, -1, -1, -1, -1
One way is to use itertools.groupby to find the groups and calculate their midpoints:
from itertools import groupby
a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1,-1, 0, 0, 0, 0, 0]
groups = [list(g) for _, g in groupby(a)]
output_vector = [sum(1 for x in groups[:i] for _ in x) + len(x) // 2 for i, x in enumerate(groups)]
# [2, 8, 14, 19]
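To see why this works, it can help to look at what groupby actually produces for this input. A small sketch (the variable name runs is only for illustration): each item is a key plus an iterator over one run of equal neighbouring values, so the run lengths fall out directly.

```python
from itertools import groupby

a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0]

# Each item from groupby is (key, iterator over one run of equal neighbours).
runs = [(key, len(list(group))) for key, group in groupby(a)]
print(runs)
# [(0, 5), (1, 7), (-1, 5), (0, 5)]
```

The midpoint of a run is then the total length of all earlier runs plus half of its own length.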
The itertools method is probably better and cleaner. Nonetheless, here's a method that uses math and statistics and finds the median of the start and end indexes of each set of numbers.
import math
import statistics as stat
a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0]
lastNum = None
startIdx = 0
midpts = []
for idx, x in enumerate(a):
    if lastNum is not None and lastNum != x or idx == len(a) - 1:
        midpts.append(math.floor(stat.median([startIdx, idx])))
        startIdx = idx
    lastNum = x
print(midpts)
# [2, 8, 14, 19]
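One caveat with the loop above: it closes a run either when the value changes or when the last index is reached, but not both at once, so a list that ends in a single-element run (for example [0, 0, 1]) loses its final midpoint. Below is a minimal sketch of a variant that treats the end of the list as one more boundary. The name run_midpoints is hypothetical, and plain integer arithmetic is swapped in for statistics.median.

```python
def run_midpoints(values):
    """Return the midpoint index of each run of equal consecutive values."""
    midpoints = []
    start = 0  # index where the current run begins
    for idx in range(1, len(values) + 1):
        # A run ends at the end of the list or where the value changes.
        if idx == len(values) or values[idx] != values[start]:
            # idx is one past the last element of the run.
            midpoints.append(start + (idx - start) // 2)
            start = idx
    return midpoints


a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0]
print(run_midpoints(a))
# [2, 8, 14, 19]
print(run_midpoints([0, 0, 1]))
# [1, 2]
```

For even-length runs this picks the upper of the two middle indices, which matches the len // 2 convention of the groupby versions.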
Another itertools-based solution, but more efficient: it counts each group once and keeps a running offset instead of recounting all earlier groups.
from itertools import groupby
a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1,-1, 0, 0, 0, 0, 0]
output = []
psum = 0
for glen in (sum(1 for i in g) for k, g in groupby(a)):
    output.append(psum + glen // 2)
    psum += glen
print(output)
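If you want to check the efficiency claim on your own data, a rough sketch with timeit follows. The function names and the test data are made up for illustration, and the timings will vary by machine, so treat the numbers as indicative only.

```python
from itertools import groupby
import timeit


def midpoints_rescan(a):
    # First approach: materialise every group, then recount all earlier groups.
    groups = [list(g) for _, g in groupby(a)]
    return [sum(1 for x in groups[:i] for _ in x) + len(x) // 2
            for i, x in enumerate(groups)]


def midpoints_running(a):
    # Second approach: one pass with a running offset (psum).
    output, psum = [], 0
    for glen in (sum(1 for _ in g) for _, g in groupby(a)):
        output.append(psum + glen // 2)
        psum += glen
    return output


# Hypothetical test data: 500 alternating runs of length 5.
data = [i % 2 for i in range(500) for _ in range(5)]
assert midpoints_rescan(data) == midpoints_running(data)

for fn in (midpoints_rescan, midpoints_running):
    seconds = timeit.timeit(lambda: fn(data), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

The gap should grow with the number of runs, since the first version walks over every earlier element again for each group.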
A C++ implementation of @Matt M's answer:
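#include <cstddef>
#include <vector>

// Midpoint of two indices, written to avoid overflow in start + end.
std::size_t findMedian(std::size_t start, std::size_t end) {
    return start + (end - start) / 2;
}

template<typename T>
std::vector<std::size_t> getPeaks(const std::vector<T>& input_vector) {
    std::vector<std::size_t> output;
    std::size_t startIdx = 0;
    for (std::size_t i = 0; i < input_vector.size(); ++i) {
        // Compare with the previous element instead of a sentinel such as 10000,
        // which would misfire if that value appeared in the input.
        bool valueChanged = i > 0 && input_vector[i - 1] != input_vector[i];
        bool lastElement = i == input_vector.size() - 1;
        if (valueChanged || lastElement) {
            output.emplace_back(findMedian(startIdx, i));
            startIdx = i;
        }
    }
    return output;
}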