List divide based on element from another list

I have two lists as follows:

a = [646, 650, 654, 658, 662, 666, 670, 674, 678, 682, 686, 690, 694, 698, 702, 706, 13565, 13569, 13573, 13577, 13581, 13585, 13589, 13593, 13597, 13601, 13605, 13609, 13613, 13617, 13621, 13625, 13629, 13633, 13637, 13641, 13645, 13649, 13653, 13657, 13661, 21237, 21241, 21245, 21249, 21253, 21257, 21261, 21265, 21269, 21273, 21277, 21281, 21285, 21289, 21293, 21297, 21301, 21305, 21309, 21313, 21317, 21321, 21325, 21329, 21333, 21337, 21341, 21345]

b = [646, 706, 13661, 21345]

So basically I want to break list a into smaller chunks based on the start/stop values from list b. For example, something like this:

[
[646, 650, 654, 658, 662, 666, 670, 674, 678, 682, 686, 690, 694, 698, 702, 706],
[13565, 13569, 13573, 13577, 13581, 13585, 13589, 13593, 13597, 13601, 13605, 13609, 13613, 13617, 13621, 13625, 13629, 13633, 13637, 13641, 13645, 13649, 13653, 13657, 13661],
[21237, 21241, 21245, 21249, 21253, 21257, 21261, 21265, 21269, 21273, 21277, 21281, 21285, 21289, 21293, 21297, 21301, 21305, 21309, 21313, 21317, 21321, 21325, 21329, 21333, 21337, 21341, 21345]
]

Can someone please help me figure this out?

Solution 1: use bisect

I would solve this problem by using the bisect module to find where each item in a would be inserted in b, which determines the bin the item belongs to.

This solution does not require a to be sorted, but it does require b to be sorted.

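import bisect

bin_boundaries = sorted(b)
# One bin per gap between boundaries, plus one edge bin at each end.
results = [[] for _ in range(len(bin_boundaries)+1)]
for i in a:
    # Index of the bin whose upper boundary is the first one >= i.
    pos = bisect.bisect_left(bin_boundaries, i)
    results[pos].append(i)
print(results)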

Now, you did not specify whether an item equal to a boundary should go in the previous bin or the next one. I placed it in the previous bin. If you meant the next, replace bisect_left with bisect_right above.

I also output two more bins than your expected output shows: the first bin holds the items at or below the first boundary (with bisect_left, an item equal to the first boundary, such as 646, lands there), and the last bin holds the items greater than the last boundary. Add results = results[1:-1] if you want to remove those edge bins.
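To make the edge-bin behaviour concrete, here is a minimal sketch on a shortened, made-up sample shaped like the question's data. The names results and chunks and the short lists are illustrative only. It shows where an item equal to the first boundary ends up, and one possible way to recover the grouping from the question, assuming a contains nothing below the first boundary.

```python
import bisect

# Shortened, made-up sample in the same shape as the question's data.
a = [646, 650, 706, 13565, 13661, 21237, 21345]
b = [646, 706, 13661, 21345]

bin_boundaries = sorted(b)
results = [[] for _ in range(len(bin_boundaries) + 1)]
for item in a:
    # bisect_left sends an item equal to a boundary to the bin that ends there.
    results[bisect.bisect_left(bin_boundaries, item)].append(item)

print(results)
# [[646], [650, 706], [13565, 13661], [21237, 21345], []]

# Assumption: nothing in a is below the first boundary, so the leading edge
# bin only holds the first boundary itself and can be folded into bin 1.
chunks = [results[0] + results[1]] + results[2:-1]
print(chunks)
# [[646, 650, 706], [13565, 13661], [21237, 21345]]
```

If a could contain values below the first boundary, folding the leading edge bin in this way would pull them into the first chunk as well, so that case would need its own filter.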

Solution 2: just loop over the list for each bin

Now, here's a much simpler solution that just traverses a once for each bin:

bin_boundaries = sorted(b)
results = []
for low, high in zip(bin_boundaries[:-1], bin_boundaries[1:]):
    results.append([i for i in a if i > low and i <= high])
print(results)

This time, I didn't create the edge bins. Again, adjust the > and <= to match the semantics you actually want at the edges.

That outer loop can also be turned into a list comprehension, which gives you a nested list comprehension and a very compact solution:

results = [
    [i for i in a if i > low and i <= high]
    for low, high in zip(bin_boundaries[:-1], bin_boundaries[1:])
]
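As a rough illustration of how the comparison operators change the result, the sketch below runs the comprehension on a shortened, made-up sample. The names pairs, strict and first_inclusive are hypothetical, and the second variant is only one possible way to make the first lower bound inclusive.

```python
# Shortened, made-up sample in the same shape as the question's data.
a = [646, 650, 706, 13565, 13661, 21237, 21345]
b = [646, 706, 13661, 21345]

bin_boundaries = sorted(b)
pairs = list(zip(bin_boundaries[:-1], bin_boundaries[1:]))

# Strict lower bound for every bin: 646 is left out of the first chunk.
strict = [[i for i in a if low < i <= high] for low, high in pairs]
print(strict)
# [[650, 706], [13565, 13661], [21237, 21345]]

# Inclusive lower bound for the first bin only, strict for the rest.
first_inclusive = [
    [i for i in a if (low <= i if n == 0 else low < i) and i <= high]
    for n, (low, high) in enumerate(pairs)
]
print(first_inclusive)
# [[646, 650, 706], [13565, 13661], [21237, 21345]]
```

Both variants scan a once per bin, so the cost grows with len(a) times the number of bins; for a handful of boundaries that is usually fine.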

From your example, I understand that you want the first interval to include both boundaries ([646, 706]), while the others include only the upper boundary: (706, 13661] and (13661, 21345].

Here I use the .index method and a for loop that includes the lower boundary for the first interval and excludes it for the others:

lists_result = []

for i in range(len(b[:-1])):
    idx_inf = a.index(b[i])
    idx_sup = a.index(b[i+1])
    if i == 0:
        lists_result.append(a[idx_inf:idx_sup+1])
    else:
        lists_result.append(a[idx_inf+1:idx_sup+1])
