Improving the speed of mapping a list of Lab colors to a second list of Lab colors

Question

I have two lists that produce a fairly large nested loop, and it is extremely slow: 3.5 to 4 seconds. I'm looking to improve that. Both lists contain Lab colors. The first list is a color palette; call it palette_colors.

The second list holds the individual Lab colors that I want to compare; call it query_colors.

I loop through query_colors and compare each color in it to every color in palette_colors. That gives a distance, which is used to check whether the color falls within a certain threshold.

Since palette_colors is a large list (about 300 items) and query_colors has around 100, the comparison runs around 30,000 times.

So the question is: how can this be improved to run much faster?

Here are some of my thoughts:

Parallel Processing: I tried to use parallel processing, but either it wasn't the right context for it or I just didn't know what I was doing. I'm leaning towards the latter.
Cache between processed hex values: My first thought was to cache the distance between color combinations. However, that doesn't help much because colors are very specific: FFFFFF != FFFFFE, even though they are visibly the same.
Initial Hex Lookup Cache: Another thought was to compare the hex values first and, if they match, just return that match. However, it suffers from the same problem as the caching idea above.
Numpy Arrays + Distance Function: Perhaps there is a way to turn both lists into numpy arrays that only contain the Lab values, then compare them using the CIELAB2000 distance function?

Here is my fully functioning script (make sure to install colormath):

from time import time
from colormath.color_diff import delta_e_cie2000
from colormath.color_objects import LabColor
from operator import itemgetter

# Helper function for timing
milli_time = lambda: int(round(time() * 1000))


# when merging similar colors, check to see how much of that color there is before merging
def map_colors(query_colors, max_dist=100):

    # Contains colors from the palette that are closest to each color
    close_colors = []

    # loop through colors that we want to map
    for color_to_compare in query_colors:

        # compare lab distance with palette colors
        closest = [check_distance(palette_color, color_to_compare, max_dist) for palette_color in palette_colors]

        # Remove "none" values
        closest = [c for c in closest if c is not None]

        # sort by distance (ascending)
        closest = sorted(closest, key=itemgetter('distance'))[:1][0]['hex']

        # Remove hash
        closest = closest.replace('#','').lower()

        # Add to main list of closest colors
        close_colors.append(closest)

    return close_colors


# Checks the distance between lab colors
def check_distance(color_1, color_2, max_dist):
    distance = delta_e_cie2000(color_1['lab'], color_2['lab'])
    if distance < max_dist:
        return {
            'hex': color_1['hex'], 
            'lab': color_1['lab'],
            'distance': distance
        }

# list of palette colors
# Stack Overflow doesn't allow this many characters,
# so you'll have to copy and paste the color palette from this URL:

# https://codepen.io/anon/pen/bvrwzE?editors=1010
palette_colors = [] # ^^^^

# list of colors to compare
query_colors = [{'lab': LabColor(lab_l=89.82760556495964,lab_a=-3.4924545681218055,lab_b=13.558600954011734)}, {'lab': LabColor(lab_l=2.014962108133794,lab_a=0.22811941599047703,lab_b=1.790011046195017)}, {'lab': LabColor(lab_l=40.39474520781096,lab_a=2.901069537563777,lab_b=11.280131535056025)}, {'lab': LabColor(lab_l=67.39662457756837,lab_a=-2.5976442408520706,lab_b=26.652254040495404)}, {'lab': LabColor(lab_l=32.389426017556374,lab_a=1.0164239936505115,lab_b=12.27627339551004)}, {'lab': LabColor(lab_l=55.13922546782179,lab_a=-1.435016766528352,lab_b=35.18742442417581)}, {'lab': LabColor(lab_l=73.96645091673257,lab_a=1.0198226618362005,lab_b=18.548230422095546)}, {'lab': LabColor(lab_l=44.90651839131053,lab_a=-1.4672716457064805,lab_b=18.154138443480683)}, {'lab': LabColor(lab_l=60.80488926260843,lab_a=-8.077128235007613,lab_b=16.719069040228884)}, {'lab': LabColor(lab_l=4.179197112322317,lab_a=3.642005050652555,lab_b=3.0407269339523646)}, {'lab': LabColor(lab_l=30.180289034511695,lab_a=1.7045267250474505,lab_b=28.01083333844222)}, {'lab': LabColor(lab_l=44.31005006010243,lab_a=-4.362010483995816,lab_b=18.432029645523528)}, {'lab': LabColor(lab_l=0.8423115373777676,lab_a=0.13906540788867494,lab_b=-0.3786920370309088)}, {'lab': LabColor(lab_l=52.12865600856179,lab_a=-0.5797000071502412,lab_b=31.8790459272144)}, {'lab': LabColor(lab_l=67.92970225276791,lab_a=-4.149165904914209,lab_b=33.253179101415256)}, {'lab': LabColor(lab_l=60.97889320274747,lab_a=3.338501380000247,lab_b=20.062676387837676)}, {'lab': LabColor(lab_l=2.593838857738689,lab_a=2.824229469131745,lab_b=2.704743489514988)}, {'lab': LabColor(lab_l=7.392989008245966,lab_a=9.59267973632079,lab_b=6.729836507330539)}, {'lab': LabColor(lab_l=98.10223593819727,lab_a=-1.3873907335449909,lab_b=4.897317053977535)}, {'lab': LabColor(lab_l=82.313865698896,lab_a=2.588499921779952,lab_b=2.5971717623187507)}, {'lab': LabColor(lab_l=28.371415683395696,lab_a=5.560367090545137,lab_b=0.6970013651421025)}, {'lab': 
LabColor(lab_l=41.300756170362206,lab_a=-1.8010193876651093,lab_b=5.122094973647007)}, {'lab': LabColor(lab_l=5.26507956373176,lab_a=4.548521840585698,lab_b=-0.8421897365563757)}, {'lab': LabColor(lab_l=60.53644890578005,lab_a=1.9353937585603886,lab_b=13.731983810148996)}, {'lab': LabColor(lab_l=18.50664175674912,lab_a=4.127558915370255,lab_b=1.5318785538835367)}, {'lab': LabColor(lab_l=46.121107041110534,lab_a=-4.738660301778608,lab_b=11.46208844171116)}, {'lab': LabColor(lab_l=35.096818879142134,lab_a=3.865379674380942,lab_b=8.636348905128832)}, {'lab': LabColor(lab_l=23.053962804968776,lab_a=1.7671822304096418,lab_b=2.044120086931378)}, {'lab': LabColor(lab_l=34.77343072376579,lab_a=-3.57662664587155,lab_b=9.259575358162131)}, {'lab': LabColor(lab_l=35.35931031618316,lab_a=5.074166825160403,lab_b=7.782881046177659)}, {'lab': LabColor(lab_l=21.404442965730887,lab_a=3.157463425084356,lab_b=18.391549176595827)}, {'lab': LabColor(lab_l=86.26486893959512,lab_a=4.032848274744483,lab_b=-8.58323099615992)}, {'lab': LabColor(lab_l=45.991759128676385,lab_a=0.491023915355826,lab_b=10.794889190806279)}, {'lab': LabColor(lab_l=8.10395281254021,lab_a=2.434569728945693,lab_b=12.18393849532981)}, {'lab': LabColor(lab_l=37.06003096203893,lab_a=1.8239118316595027,lab_b=25.900755157740306)}, {'lab': LabColor(lab_l=34.339870663873945,lab_a=4.98653095415319,lab_b=1.8327067580758416)}, {'lab': LabColor(lab_l=46.981747324933046,lab_a=5.292489697923786,lab_b=6.937195284587405)}, {'lab': LabColor(lab_l=35.813822728158144,lab_a=29.12172183663478,lab_b=31.259045232888216)}, {'lab': LabColor(lab_l=83.84664420563516,lab_a=4.076393227849973,lab_b=7.589758095027621)}, {'lab': LabColor(lab_l=4.862540354567976,lab_a=3.691877768850965,lab_b=4.065132741305494)}, {'lab': LabColor(lab_l=29.520608025204446,lab_a=15.21028328876109,lab_b=-1.9817725741452907)}, {'lab': LabColor(lab_l=2.9184863831701477,lab_a=3.1009055082606847,lab_b=2.374657313916806)}, {'lab': 
LabColor(lab_l=25.119337116801645,lab_a=6.36800573668811,lab_b=5.191791275068236)}, {'lab': LabColor(lab_l=32.49319565030376,lab_a=4.09934993369665,lab_b=4.837690385449466)}, {'lab': LabColor(lab_l=6.09612588470991,lab_a=9.66024466422727,lab_b=2.297265839425217)}, {'lab': LabColor(lab_l=32.607204509025415,lab_a=37.17700423170081,lab_b=11.087136268936316)}, {'lab': LabColor(lab_l=45.72621067797596,lab_a=4.995679962723376,lab_b=8.10305144884066)}, {'lab': LabColor(lab_l=15.182103174406642,lab_a=17.3648698250356,lab_b=16.351707883547945)}, {'lab': LabColor(lab_l=30.735504056893177,lab_a=20.749263489097476,lab_b=11.103091166084845)}, {'lab': LabColor(lab_l=47.58987428222485,lab_a=21.4969535181187,lab_b=24.91820246623675)}, {'lab': LabColor(lab_l=3.2817937526961423,lab_a=7.0384930526659755,lab_b=5.0447129238750605)}, {'lab': LabColor(lab_l=39.176664955904386,lab_a=7.001035374555848,lab_b=7.1369181820884915)}, {'lab': LabColor(lab_l=32.47219675839261,lab_a=-2.4501733403216597,lab_b=10.408787644368223)}, {'lab': LabColor(lab_l=8.87372837821,lab_a=-2.5643873356231834,lab_b=5.64931313305761)}, {'lab': LabColor(lab_l=1.742927713725976,lab_a=0.539611795069117,lab_b=-0.6652519493932862)}, {'lab': LabColor(lab_l=33.873919675420986,lab_a=5.764566965886092,lab_b=-17.964944971494113)}, {'lab': LabColor(lab_l=40.693479627397174,lab_a=6.595272818345682,lab_b=5.018268124660407)}, {'lab': LabColor(lab_l=88.60103885061399,lab_a=2.6126810949935186,lab_b=-2.945792185321894)}, {'lab': LabColor(lab_l=55.70462312256947,lab_a=6.028112199048142,lab_b=-13.056527815975972)}, {'lab': LabColor(lab_l=9.115995988538636,lab_a=31.807462808077545,lab_b=-35.11774548995232)}, {'lab': LabColor(lab_l=38.051505820085076,lab_a=34.8155573981796,lab_b=-18.475401488472354)}, {'lab': LabColor(lab_l=71.92703712306943,lab_a=-3.471403558562458,lab_b=-10.445020993962896)}, {'lab': LabColor(lab_l=26.243044230459148,lab_a=46.369628814522414,lab_b=34.6338595372704)}, {'lab': 
LabColor(lab_l=66.76005751735073,lab_a=20.035224514354134,lab_b=25.87283658575612)}, {'lab': LabColor(lab_l=63.60391924768574,lab_a=-2.891469413896064,lab_b=9.573769130513398)}, {'lab': LabColor(lab_l=41.24069266482021,lab_a=16.278878911463096,lab_b=9.759226052984914)}, {'lab': LabColor(lab_l=27.25079531257893,lab_a=24.94884066949429,lab_b=-48.598531002024316)}, {'lab': LabColor(lab_l=4.265814465219158,lab_a=10.473548710425703,lab_b=4.1174226612907985)}, {'lab': LabColor(lab_l=87.15090987843114,lab_a=7.229747311809753,lab_b=-14.635793427155486)}, {'lab': LabColor(lab_l=54.54311545632727,lab_a=8.647572834710072,lab_b=-18.893603550071546)}, {'lab': LabColor(lab_l=11.276968541214082,lab_a=18.169882892627108,lab_b=-30.249378295412065)}, {'lab': LabColor(lab_l=35.090989205367,lab_a=1.0233204899371962,lab_b=-0.3006113739771554)}, {'lab': LabColor(lab_l=2.9317972315881953,lab_a=0.8523516700251477,lab_b=0.29972821911726233)}, {'lab': LabColor(lab_l=42.71927233847029,lab_a=15.072870104265279,lab_b=-31.54622665459128)}, {'lab': LabColor(lab_l=1.622807369995023,lab_a=1.0292382494377224,lab_b=1.2173768955478448)}, {'lab': LabColor(lab_l=85.05833643040985,lab_a=1.955449992315006,lab_b=-9.91904370358645)}, {'lab': LabColor(lab_l=1.6648316964409666,lab_a=0.13905563573127222,lab_b=-0.37887416481000025)}, {'lab': LabColor(lab_l=53.47424677173646,lab_a=1.322931077791023,lab_b=-0.14670143432404803)}, {'lab': LabColor(lab_l=3.7059376097529935,lab_a=0.31588132922930057,lab_b=0.11051016676932868)}, {'lab': LabColor(lab_l=1.4885457056861533,lab_a=0.6786902325009586,lab_b=-1.043701149385401)}, {'lab': LabColor(lab_l=16.298330353761287,lab_a=0.4909724855400033,lab_b=3.125329071162186)}]


if __name__ == '__main__':

    start_time = milli_time()

    colors = map_colors(query_colors)

    print(colors)
    print('Script took', milli_time() - start_time, 'milliseconds to run.')

Answer 1

Instead of color objects and a list comprehension, you can use the array function from color_diff_matrix on raw Lab values:

import numpy as np
from colormath.color_diff_matrix import delta_e_cie2000

# Colors as raw Lab values
# Some test data
palette_colors = np.tile([ 2.01496211,  0.22811942,  1.79001105], [300, 1])
color_to_compare = np.array([ 89.82760556,  -3.49245457,  13.55860095])

dist = delta_e_cie2000(color_to_compare, palette_colors)
closest = palette_colors[np.argmin(dist)]  # also color as raw Lab components

This should already give a nice speedup, but I got another factor of 5 by jitting the function with numba:

from colormath.color_diff_matrix import delta_e_cie2000
from numba import jit

delta_e_cie2000_jit = jit(delta_e_cie2000)
dist = delta_e_cie2000_jit(color_to_compare, palette_colors)
... # the rest is the same

Note that the first execution of the jitted function is slow because of the compilation step.
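To map every query color rather than a single one, the same idea can be extended with broadcasting. The block below is only a minimal sketch under stated assumptions, not the code from this answer: `to_lab_array` and `nearest_palette_indices` are hypothetical helper names, the `Lab` namedtuple stands in for colormath's `LabColor` (only the `lab_l`, `lab_a`, `lab_b` attributes are assumed), the hex values are illustrative sample data, and plain Euclidean distance (CIE76) is used as a stand-in metric so that the block runs with NumPy alone. For CIEDE2000, one could instead loop over the rows of the query array and call the `delta_e_cie2000` matrix function shown above on each row.

```python
from collections import namedtuple

import numpy as np

# Stand-in for colormath's LabColor; only these three attributes are assumed.
Lab = namedtuple("Lab", ["lab_l", "lab_a", "lab_b"])


def to_lab_array(colors):
    """Turn a list of {'lab': <Lab-like>} dicts into an (n, 3) float array."""
    rows = [[c["lab"].lab_l, c["lab"].lab_a, c["lab"].lab_b] for c in colors]
    return np.array(rows, dtype=float)


def nearest_palette_indices(query, palette):
    """For each query row, return the index of the closest palette row (CIE76)."""
    # (n_query, 1, 3) - (1, n_palette, 3) broadcasts to (n_query, n_palette, 3)
    diff = query[:, None, :] - palette[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.argmin(axis=1)


# Illustrative sample data, not the palette from the question.
palette_colors = [
    {"hex": "#000000", "lab": Lab(0.0, 0.0, 0.0)},
    {"hex": "#ffffff", "lab": Lab(100.0, 0.0, 0.0)},
    {"hex": "#777777", "lab": Lab(50.0, 0.0, 0.0)},
]
query_colors = [
    {"lab": Lab(89.8, -3.5, 13.6)},
    {"lab": Lab(2.0, 0.2, 1.8)},
    {"lab": Lab(40.4, 2.9, 11.3)},
]

idx = nearest_palette_indices(to_lab_array(query_colors), to_lab_array(palette_colors))
closest_hex = [palette_colors[i]["hex"].replace("#", "").lower() for i in idx]
print(closest_hex)
```

Converting the lists to arrays once, outside any loop, is the main point: the per-pair Python overhead of building dicts and calling a function 30,000 times disappears.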

Answer 2

This is an example of the nearest neighbor problem. The usual approach for an exact answer with many query points is based on space partitioning. This would be easy for the CIE76 metric because it is Euclidean (in L*a*b* space): a kd tree can be used directly.

It's still possible to apply the space-partitioning approach with the newer, more complicated metrics. However, for an exact result you have to derive a bound on the distance between a point and whatever partitions you choose: that is, the minimum distance between a point and any point that would be on the other side. A coordinate-aligned plane (as used for the Euclidean case) makes for the simplest partitioning, but might give a poor bound.
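For the Euclidean (CIE76) case, that bound is simply the perpendicular distance from the query point to the splitting plane. The following is a minimal pure-Python sketch of the pruning rule, under stated assumptions: `build_kdtree` and `nearest` are illustrative names, the palette values are made-up sample data, and the `abs(diff)` test is only valid for Euclidean distance. For CIEDE2000 it would have to be replaced by a proven lower bound for that metric, which this sketch does not attempt to derive. In practice a library implementation such as `scipy.spatial.cKDTree` would likely be used for the Euclidean case.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a kd tree over (L, a, b) tuples."""
    if not points:
        return None
    axis = depth % 3  # cycle through L, a, b
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    dist = math.dist(node["point"], target)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = target[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, target, best)
    # The bound: no point on the far side of the plane can be closer than
    # |diff|, so the far subtree is searched only if |diff| beats the best.
    if abs(diff) < best[0]:
        best = nearest(far, target, best)
    return best


# Made-up sample palette as raw (L, a, b) tuples.
palette = [
    (0.0, 0.0, 0.0),
    (100.0, 0.0, 0.0),
    (50.0, 0.0, 0.0),
    (53.2, 80.1, 67.2),
    (32.3, 79.2, -107.9),
]
tree = build_kdtree(palette)
dist, point = nearest(tree, (40.4, 2.9, 11.3))
print(point, round(dist, 2))
```

The tree is built once per palette and reused for every query, which is where the saving over the 300-by-100 brute-force loop would come from.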

If an approximate answer suffices, you can approximate the 2000 metric with the 76 one for the purposes of partitioning. You can also switch to a binning approach, where you round the coordinates onto a coarse grid and then search it in a structured fashion to find the closest match. Each of these approaches often yields the exact result but cannot guarantee it.
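The binning approach could look roughly like the sketch below. It is an illustration under assumptions, not a tuned implementation: the cell size `CELL`, the helper names `cell_of`, `build_grid` and `approx_nearest`, and the sample palette are all hypothetical, and candidates are ranked by Euclidean (CIE76) distance. The search only looks at the query's cell and its 26 neighbors, falling back to a full scan when those are empty, so a closer color outside that block can be missed.

```python
import math
from collections import defaultdict
from itertools import product

CELL = 10.0  # grid cell size in Lab units (an arbitrary choice for this sketch)


def cell_of(color):
    """Round an (L, a, b) tuple down onto the coarse grid."""
    return tuple(int(math.floor(c / CELL)) for c in color)


def build_grid(palette):
    """Group palette colors by the grid cell they fall into."""
    grid = defaultdict(list)
    for color in palette:
        grid[cell_of(color)].append(color)
    return grid


def approx_nearest(grid, palette, query):
    """Search the query's cell and its 26 neighbors; else scan everything."""
    cx, cy, cz = cell_of(query)
    candidates = [
        color
        for dx, dy, dz in product((-1, 0, 1), repeat=3)
        for color in grid.get((cx + dx, cy + dy, cz + dz), [])
    ]
    if not candidates:
        candidates = palette  # fallback: brute force over the whole palette
    return min(candidates, key=lambda color: math.dist(color, query))


# Made-up sample palette as raw (L, a, b) tuples.
palette = [
    (0.0, 0.0, 0.0),
    (100.0, 0.0, 0.0),
    (50.0, 0.0, 0.0),
    (53.2, 80.1, 67.2),
    (32.3, 79.2, -107.9),
]
grid = build_grid(palette)
print(approx_nearest(grid, palette, (40.4, 2.9, 11.3)))
print(approx_nearest(grid, palette, (70.0, 40.0, 40.0)))
```

The final ranking of the few candidates could use the CIEDE2000 function instead of `math.dist`, since only a handful of distances are computed per query at that point.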
