

Improve the Speed of Mapping a List of Lab Colors to a Second List of Lab Colors

I have two lists that produce a fairly large nested loop, and it is extremely slow: 3.5 to 4 seconds. I'm looking to improve that. Both lists contain Lab colors. The first list is a color palette; call it palette_colors.

The second list holds the individual Lab colors that I want to compare against the palette; call it query_colors.

I loop through query_colors and compare each color in it to every color in the palette_colors list. Each comparison yields a distance, which is used to check whether the color falls within a certain threshold.

The problem is that, since palette_colors is a large list (about 300 items) and query_colors has around 100, the loop body runs around 30,000 times.

So the question is: how can this be improved to run much faster?


Here are some of my thoughts:

  1. Parallel Processing: I tried to use parallel processing, but either this wasn't the right context for it or I just didn't know what I was doing... I'm leaning towards the latter.

  2. Cache between processed hex values: My first thought was to cache the distance between color combinations. However, that doesn't help much because colors are very specific: FFFFFF != FFFFFE, even though they are visibly the same.

  3. Initial Hex Lookup Cache: Another thought was to compare the hex values first, and if they match, just return that match. However, this suffers from the same problem as the caching idea above.

  4. Numpy Arrays + Distance Function: Perhaps there is a way to turn both lists into numpy arrays that contain only the Lab values, and then compare them using the CIEDE2000 distance function?

Here is my fully functioning script (make sure to install colormath):

from time import time
from colormath.color_diff import delta_e_cie2000
from colormath.color_objects import LabColor
from operator import itemgetter

# Helper function for timing
milli_time = lambda: int(round(time() * 1000))


# when merging similar colors, check to see how much of that color there is before merging
def map_colors(query_colors, max_dist=100):

    # Contains colors from the palette that are closest to each color
    close_colors = []

    # loop through colors that we want to map
    for color_to_compare in query_colors:

        # compare lab distance with palette colors
        closest = [check_distance(palette_color, color_to_compare, max_dist) for palette_color in palette_colors]

        # Remove "none" values
        closest = [c for c in closest if c is not None]

        # sort by distance (ascending)
        closest = sorted(closest, key=itemgetter('distance'))[:1][0]['hex']

        # Remove hash
        closest = closest.replace('#','').lower()

        # Add to main list of closest colors
        close_colors.append(closest)

    return close_colors


# Checks the distance between lab colors
def check_distance(color_1, color_2, max_dist):
    distance = delta_e_cie2000(color_1['lab'], color_2['lab'])
    if distance < max_dist:
        return {
            'hex': color_1['hex'], 
            'lab': color_1['lab'],
            'distance': distance
        }

# list of palette colors
# Stack Overflow doesn't allow this many characters,
# so you'll have to copy and paste the color palette from this url:

# https://codepen.io/anon/pen/bvrwzE?editors=1010
palette_colors = [] # ^^^^

# list of colors to compare
query_colors = [{'lab': LabColor(lab_l=89.82760556495964,lab_a=-3.4924545681218055,lab_b=13.558600954011734)}, {'lab': LabColor(lab_l=2.014962108133794,lab_a=0.22811941599047703,lab_b=1.790011046195017)}, {'lab': LabColor(lab_l=40.39474520781096,lab_a=2.901069537563777,lab_b=11.280131535056025)}, {'lab': LabColor(lab_l=67.39662457756837,lab_a=-2.5976442408520706,lab_b=26.652254040495404)}, {'lab': LabColor(lab_l=32.389426017556374,lab_a=1.0164239936505115,lab_b=12.27627339551004)}, {'lab': LabColor(lab_l=55.13922546782179,lab_a=-1.435016766528352,lab_b=35.18742442417581)}, {'lab': LabColor(lab_l=73.96645091673257,lab_a=1.0198226618362005,lab_b=18.548230422095546)}, {'lab': LabColor(lab_l=44.90651839131053,lab_a=-1.4672716457064805,lab_b=18.154138443480683)}, {'lab': LabColor(lab_l=60.80488926260843,lab_a=-8.077128235007613,lab_b=16.719069040228884)}, {'lab': LabColor(lab_l=4.179197112322317,lab_a=3.642005050652555,lab_b=3.0407269339523646)}, {'lab': LabColor(lab_l=30.180289034511695,lab_a=1.7045267250474505,lab_b=28.01083333844222)}, {'lab': LabColor(lab_l=44.31005006010243,lab_a=-4.362010483995816,lab_b=18.432029645523528)}, {'lab': LabColor(lab_l=0.8423115373777676,lab_a=0.13906540788867494,lab_b=-0.3786920370309088)}, {'lab': LabColor(lab_l=52.12865600856179,lab_a=-0.5797000071502412,lab_b=31.8790459272144)}, {'lab': LabColor(lab_l=67.92970225276791,lab_a=-4.149165904914209,lab_b=33.253179101415256)}, {'lab': LabColor(lab_l=60.97889320274747,lab_a=3.338501380000247,lab_b=20.062676387837676)}, {'lab': LabColor(lab_l=2.593838857738689,lab_a=2.824229469131745,lab_b=2.704743489514988)}, {'lab': LabColor(lab_l=7.392989008245966,lab_a=9.59267973632079,lab_b=6.729836507330539)}, {'lab': LabColor(lab_l=98.10223593819727,lab_a=-1.3873907335449909,lab_b=4.897317053977535)}, {'lab': LabColor(lab_l=82.313865698896,lab_a=2.588499921779952,lab_b=2.5971717623187507)}, {'lab': LabColor(lab_l=28.371415683395696,lab_a=5.560367090545137,lab_b=0.6970013651421025)}, {'lab': 
LabColor(lab_l=41.300756170362206,lab_a=-1.8010193876651093,lab_b=5.122094973647007)}, {'lab': LabColor(lab_l=5.26507956373176,lab_a=4.548521840585698,lab_b=-0.8421897365563757)}, {'lab': LabColor(lab_l=60.53644890578005,lab_a=1.9353937585603886,lab_b=13.731983810148996)}, {'lab': LabColor(lab_l=18.50664175674912,lab_a=4.127558915370255,lab_b=1.5318785538835367)}, {'lab': LabColor(lab_l=46.121107041110534,lab_a=-4.738660301778608,lab_b=11.46208844171116)}, {'lab': LabColor(lab_l=35.096818879142134,lab_a=3.865379674380942,lab_b=8.636348905128832)}, {'lab': LabColor(lab_l=23.053962804968776,lab_a=1.7671822304096418,lab_b=2.044120086931378)}, {'lab': LabColor(lab_l=34.77343072376579,lab_a=-3.57662664587155,lab_b=9.259575358162131)}, {'lab': LabColor(lab_l=35.35931031618316,lab_a=5.074166825160403,lab_b=7.782881046177659)}, {'lab': LabColor(lab_l=21.404442965730887,lab_a=3.157463425084356,lab_b=18.391549176595827)}, {'lab': LabColor(lab_l=86.26486893959512,lab_a=4.032848274744483,lab_b=-8.58323099615992)}, {'lab': LabColor(lab_l=45.991759128676385,lab_a=0.491023915355826,lab_b=10.794889190806279)}, {'lab': LabColor(lab_l=8.10395281254021,lab_a=2.434569728945693,lab_b=12.18393849532981)}, {'lab': LabColor(lab_l=37.06003096203893,lab_a=1.8239118316595027,lab_b=25.900755157740306)}, {'lab': LabColor(lab_l=34.339870663873945,lab_a=4.98653095415319,lab_b=1.8327067580758416)}, {'lab': LabColor(lab_l=46.981747324933046,lab_a=5.292489697923786,lab_b=6.937195284587405)}, {'lab': LabColor(lab_l=35.813822728158144,lab_a=29.12172183663478,lab_b=31.259045232888216)}, {'lab': LabColor(lab_l=83.84664420563516,lab_a=4.076393227849973,lab_b=7.589758095027621)}, {'lab': LabColor(lab_l=4.862540354567976,lab_a=3.691877768850965,lab_b=4.065132741305494)}, {'lab': LabColor(lab_l=29.520608025204446,lab_a=15.21028328876109,lab_b=-1.9817725741452907)}, {'lab': LabColor(lab_l=2.9184863831701477,lab_a=3.1009055082606847,lab_b=2.374657313916806)}, {'lab': 
LabColor(lab_l=25.119337116801645,lab_a=6.36800573668811,lab_b=5.191791275068236)}, {'lab': LabColor(lab_l=32.49319565030376,lab_a=4.09934993369665,lab_b=4.837690385449466)}, {'lab': LabColor(lab_l=6.09612588470991,lab_a=9.66024466422727,lab_b=2.297265839425217)}, {'lab': LabColor(lab_l=32.607204509025415,lab_a=37.17700423170081,lab_b=11.087136268936316)}, {'lab': LabColor(lab_l=45.72621067797596,lab_a=4.995679962723376,lab_b=8.10305144884066)}, {'lab': LabColor(lab_l=15.182103174406642,lab_a=17.3648698250356,lab_b=16.351707883547945)}, {'lab': LabColor(lab_l=30.735504056893177,lab_a=20.749263489097476,lab_b=11.103091166084845)}, {'lab': LabColor(lab_l=47.58987428222485,lab_a=21.4969535181187,lab_b=24.91820246623675)}, {'lab': LabColor(lab_l=3.2817937526961423,lab_a=7.0384930526659755,lab_b=5.0447129238750605)}, {'lab': LabColor(lab_l=39.176664955904386,lab_a=7.001035374555848,lab_b=7.1369181820884915)}, {'lab': LabColor(lab_l=32.47219675839261,lab_a=-2.4501733403216597,lab_b=10.408787644368223)}, {'lab': LabColor(lab_l=8.87372837821,lab_a=-2.5643873356231834,lab_b=5.64931313305761)}, {'lab': LabColor(lab_l=1.742927713725976,lab_a=0.539611795069117,lab_b=-0.6652519493932862)}, {'lab': LabColor(lab_l=33.873919675420986,lab_a=5.764566965886092,lab_b=-17.964944971494113)}, {'lab': LabColor(lab_l=40.693479627397174,lab_a=6.595272818345682,lab_b=5.018268124660407)}, {'lab': LabColor(lab_l=88.60103885061399,lab_a=2.6126810949935186,lab_b=-2.945792185321894)}, {'lab': LabColor(lab_l=55.70462312256947,lab_a=6.028112199048142,lab_b=-13.056527815975972)}, {'lab': LabColor(lab_l=9.115995988538636,lab_a=31.807462808077545,lab_b=-35.11774548995232)}, {'lab': LabColor(lab_l=38.051505820085076,lab_a=34.8155573981796,lab_b=-18.475401488472354)}, {'lab': LabColor(lab_l=71.92703712306943,lab_a=-3.471403558562458,lab_b=-10.445020993962896)}, {'lab': LabColor(lab_l=26.243044230459148,lab_a=46.369628814522414,lab_b=34.6338595372704)}, {'lab': 
LabColor(lab_l=66.76005751735073,lab_a=20.035224514354134,lab_b=25.87283658575612)}, {'lab': LabColor(lab_l=63.60391924768574,lab_a=-2.891469413896064,lab_b=9.573769130513398)}, {'lab': LabColor(lab_l=41.24069266482021,lab_a=16.278878911463096,lab_b=9.759226052984914)}, {'lab': LabColor(lab_l=27.25079531257893,lab_a=24.94884066949429,lab_b=-48.598531002024316)}, {'lab': LabColor(lab_l=4.265814465219158,lab_a=10.473548710425703,lab_b=4.1174226612907985)}, {'lab': LabColor(lab_l=87.15090987843114,lab_a=7.229747311809753,lab_b=-14.635793427155486)}, {'lab': LabColor(lab_l=54.54311545632727,lab_a=8.647572834710072,lab_b=-18.893603550071546)}, {'lab': LabColor(lab_l=11.276968541214082,lab_a=18.169882892627108,lab_b=-30.249378295412065)}, {'lab': LabColor(lab_l=35.090989205367,lab_a=1.0233204899371962,lab_b=-0.3006113739771554)}, {'lab': LabColor(lab_l=2.9317972315881953,lab_a=0.8523516700251477,lab_b=0.29972821911726233)}, {'lab': LabColor(lab_l=42.71927233847029,lab_a=15.072870104265279,lab_b=-31.54622665459128)}, {'lab': LabColor(lab_l=1.622807369995023,lab_a=1.0292382494377224,lab_b=1.2173768955478448)}, {'lab': LabColor(lab_l=85.05833643040985,lab_a=1.955449992315006,lab_b=-9.91904370358645)}, {'lab': LabColor(lab_l=1.6648316964409666,lab_a=0.13905563573127222,lab_b=-0.37887416481000025)}, {'lab': LabColor(lab_l=53.47424677173646,lab_a=1.322931077791023,lab_b=-0.14670143432404803)}, {'lab': LabColor(lab_l=3.7059376097529935,lab_a=0.31588132922930057,lab_b=0.11051016676932868)}, {'lab': LabColor(lab_l=1.4885457056861533,lab_a=0.6786902325009586,lab_b=-1.043701149385401)}, {'lab': LabColor(lab_l=16.298330353761287,lab_a=0.4909724855400033,lab_b=3.125329071162186)}]


if __name__ == '__main__':

    start_time = milli_time()

    colors = map_colors(query_colors)

    print(colors)
    print('Script took', milli_time() - start_time, 'milliseconds to run.')

Instead of color objects and a list comprehension, you can use the array-based function from colormath.color_diff_matrix on raw Lab values:

import numpy as np
from colormath.color_diff_matrix import delta_e_cie2000

# Colors as raw Lab values
# Some test data
palette_colors = np.tile([ 2.01496211,  0.22811942,  1.79001105], [300, 1])
color_to_compare = np.array([ 89.82760556,  -3.49245457,  13.55860095])

dist = delta_e_cie2000(color_to_compare, palette_colors)
closest = palette_colors[np.argmin(dist)]  # also color as raw Lab components

This should already give a nice speedup, but I got another factor of 5 by jitting the function with numba:

from colormath.color_diff_matrix import delta_e_cie2000
from numba import jit

delta_e_cie2000_jit = jit(delta_e_cie2000)
dist = delta_e_cie2000_jit(color_to_compare, palette_colors)
... # the rest is the same

Note that the first execution of the jitted function is slow because of the compilation step.
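
To tie the array-based approach back to the data structures in the question, here is a minimal sketch of the conversion step. The names `Lab`, `to_lab_array`, `cie76_matrix` and `map_colors_array` are made up for illustration. `Lab` is a stand-in for colormath's LabColor (which exposes the same lab_l, lab_a and lab_b attributes), and `cie76_matrix` is a placeholder Euclidean metric that only mimics the (vector, matrix) calling shape used above, so the sketch runs without colormath installed. Unlike the original script, it records None when no palette color is within max_dist.

```python
from collections import namedtuple

import numpy as np

# Stand-in for colormath's LabColor (assumption: same attribute names).
Lab = namedtuple("Lab", "lab_l lab_a lab_b")


def to_lab_array(colors):
    """Turn [{'lab': <color>, ...}, ...] into an (N, 3) float array."""
    rows = [[c["lab"].lab_l, c["lab"].lab_a, c["lab"].lab_b] for c in colors]
    return np.array(rows, dtype=float)


def cie76_matrix(lab_vector, lab_matrix):
    """Placeholder metric: Euclidean distance from one color to each row."""
    return np.sqrt(np.sum((lab_matrix - lab_vector) ** 2, axis=1))


def map_colors_array(query_colors, palette_colors, dist_fn=cie76_matrix, max_dist=100):
    """Map each query color to the hex of its closest palette color."""
    palette_lab = to_lab_array(palette_colors)  # converted once, not per query
    close_colors = []
    for query_lab in to_lab_array(query_colors):
        dist = dist_fn(query_lab, palette_lab)  # one vectorized call per query
        best = int(np.argmin(dist))
        if dist[best] < max_dist:
            close_colors.append(palette_colors[best]["hex"].replace("#", "").lower())
        else:
            close_colors.append(None)  # nothing within the threshold
    return close_colors


# Tiny made-up palette and queries, just to exercise the function.
palette = [
    {"hex": "#FFFFFF", "lab": Lab(100.0, 0.0, 0.0)},
    {"hex": "#000000", "lab": Lab(0.0, 0.0, 0.0)},
]
queries = [{"lab": Lab(89.8, -3.5, 13.6)}, {"lab": Lab(2.0, 0.2, 1.8)}]
result = map_colors_array(queries, palette)
print(result)
```

If colormath is installed, passing the matrix delta_e_cie2000 (or its jitted version) as dist_fn should give the CIEDE2000 behaviour described above; the placeholder metric is there only to keep the example self-contained.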

This is an example of the nearest neighbor problem. The usual approach for an exact answer with many query points is based on space partitioning. This would be easy for the CIE76 metric because it is Euclidean (in L*a*b* space): a k-d tree can be used directly.
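
As a rough illustration of the Euclidean (CIE76) case, here is a small pure-Python k-d tree over (L, a, b) tuples. It is a sketch rather than a tuned implementation: `build_kdtree` and `nearest` are names invented for this example, the palette values are made up, and in practice a library k-d tree would likely be faster.

```python
import math


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % 3  # cycle through L, a, b
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return (distance, point) for the tree point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d = math.dist(point, target)  # CIE76 is plain Euclidean distance
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % 3
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # The splitting plane is |diff| away: cross it only if a closer
    # point could still lie on the other side.
    if abs(diff) < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


# Made-up palette of Lab triples, for illustration only.
palette = [
    (100.0, 0.0, 0.0),
    (0.0, 0.0, 0.0),
    (53.2, 80.1, 67.2),
    (87.7, -86.2, 83.2),
    (32.3, 79.2, -107.9),
]
tree = build_kdtree(palette)
dist, point = nearest(tree, (89.8, -3.5, 13.6))
print(point, round(dist, 2))
```

The tree is built once for the palette and then reused for every query color, which is where the saving over the 300 x 100 brute-force loop would come from.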

It's still possible to apply the space-partitioning approach with the newer, more complicated metrics. However, for an exact result you have to derive a bound on the distance between a point and whatever partitions you choose: that is, the minimum distance between a point and any point that would be on the other side. A coordinate-aligned plane (as used for the Euclidean case) makes for the simplest partitioning, but might give a poor bound.

If an approximate answer suffices, you can approximate the 2000 metric with the 76 one for the purposes of partitioning. You can also switch to a binning approach, where you round the coordinates onto a coarse grid and then search it in a structured fashion to find the closest match. Each of these approaches often provides, but cannot guarantee, an exact result.
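
One way to read the first approximate approach is as a two-stage search: shortlist a few candidates with the cheap CIE76 distance, then evaluate the expensive metric only on that shortlist. The sketch below assumes this reading. `approx_nearest` and `stand_in_metric` are hypothetical names, the stand-in metric is a made-up weighted distance used only so the example runs without colormath, and the shortlist is found by a linear scan for brevity (a k-d tree could supply it instead).

```python
import heapq
import math

calls = 0  # counts how often the expensive metric is evaluated


def stand_in_metric(c1, c2):
    """Hypothetical placeholder for an expensive metric such as CIEDE2000:
    a Euclidean distance that weights lightness differences half as much."""
    global calls
    calls += 1
    dl, da, db = (x - y for x, y in zip(c1, c2))
    return math.sqrt((0.5 * dl) ** 2 + da ** 2 + db ** 2)


def approx_nearest(query, palette, exact_metric, k=5):
    """Shortlist the k palette colors closest under CIE76 (Euclidean),
    then return the best of the shortlist under exact_metric."""
    shortlist = heapq.nsmallest(k, palette, key=lambda p: math.dist(p, query))
    return min(shortlist, key=lambda p: exact_metric(query, p))


# 99 made-up Lab triples on a coarse grid, for illustration only.
palette = [
    (float(l), float(a), float(b))
    for l in range(0, 101, 10)
    for a in (-40, 0, 40)
    for b in (-40, 0, 40)
]
best = approx_nearest((52.0, 3.0, -4.0), palette, stand_in_metric, k=5)
print(best, calls)
```

Here the expensive metric runs k times per query instead of once per palette color. As noted above, the result is usually but not necessarily the true nearest color under the expensive metric; a larger k trades speed for a better chance of an exact match.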


 