Map a specific string value to a specific color in matplotlib.pyplot.imshow()

Question

I have a pandas DataFrame that looks like this:

columns    0    1   2   3   4   5
           A    A   A   A   B   B
           B    B   B   C   C   D
           D    D   E   E   F   F

I want to plot this using pyplot.imshow(), with the following mapping from values to colors:

color_dict = {
    "A": "#DA291E",
    "B": "#83DF39",
    "C": "#E8132d",
    "D": "#008933",
    "E": "#006CB3",
    "F": "#52BFEC"
}

If I were plotting a bar or scatter chart, I could just pass color=a_list_of_colors, but this doesn't work with imshow().

Instead I need to pass a cmap, but as far as I understand it isn't possible to create a cmap in which a specific color is mapped to a specific value.

This means I need to create a colormap like this:

    from matplotlib.colors import ListedColormap

    # One color entry per cell, in the same order as the values
    _colors = ["#DA291E", "#DA291E", "#DA291E", "#DA291E",
               "#83DF39", "#83DF39", "#83DF39", "#83DF39", "#83DF39"]  # ...and so on
    cmap = ListedColormap(_colors, name="custom_cmap")

But is there a better way to go about this?

I thought I could implement the above method, but for some reason it doesn't work and I can't figure out why.

I begin by creating a color_list based on a long series version of my df above and then convert that list to a colormap:

color_list = list(series.map(color_dict))
custom_cmap = ListedColormap(color_list, name="custom_cmap")

The long series basically looks like this:

A
A
A
A
B
B
B
B
B
C
#...and so on

The fifth element in my df is B, and when I print custom_cmap.colors[4] I get #83DF39, which corresponds to the string value B in my df. So the mapping is correct.

The problem occurs when I call plt.imshow() with cmap=custom_cmap: the plot doesn't follow the cmap, and some values get the wrong color.

My first thought was that I had messed up the order, meaning that the color_list didn't follow the order of the df, but it does.

The df above contains 18 values and the color_list does too. The last value in the df is an F, which means that the last color in the color_list should be #52BFEC, which it is.

Adding more code.

# Begin by converting strings to any number since plt.imshow() needs numbers
float_dict = {
    'A': 0.0,
    'B': 1.0,
    'C': 2.0,
    'D': 3.0,
    'E': 4.0,
    'F': 5.0,
    'G': 6.0,
    'H': 7.0,
    'I': 8.0
}

converted_series = series.map(float_dict).copy()

# Map each float to a specific color
color_dict = {
    0.0: '#DA291E',
    1.0: '#E7112d',
    2.0: '#83CD39',
    3.0: '#009934',
    4.0: '#007AB3',
    5.0: '#54BDEC',
    6.0: '#000066',
    7.0: '#DDDD11',
    8.0: '#572B84',
}

# Create a cmap from a color list
color_list = list(converted_series.map(color_dict))
custom_cmap = ListedColormap(color_list, name="custom_cmap")

# Widen the series into a df
df = series_to_wide_df(converted_series, n_columns=8)

# Plot it
plt.imshow(df, cmap=custom_cmap, interpolation='none')

The result of the above is seen in the image below.

Note that the data in this image is not the same as the data in the df in the original post.

I tested a different color_dict:

color_dict = {
    0.0: '#FF0000',
    1.0: '#FF0000',
    2.0: '#FF0000',
    3.0: '#FF0000',
    4.0: '#FF0000',
    5.0: '#000000',
    6.0: '#000000',
    7.0: '#000000',
    8.0: '#000000'
}

But the colors still don't map correctly. With these colors, 1.0, 2.0, 6.0, 7.0 and some 8.0 get the color red.

Answer 1

It's rather hard to see where the proposed code goes wrong without it being runnable by itself.
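One plausible cause can still be checked without the full code, assuming the colormap in the question really holds one entry per cell. When imshow gets no vmin/vmax, it scales the data range onto 0..1 and then spreads that interval over all N colormap entries, so a value selects an entry by its relative size, not by its position in the series. The sketch below uses a made-up 16-cell `codes` array and the first four colors of the question's second color_dict (written in lowercase); it compares the per-cell colormap with a colormap that has one entry per distinct value.

```python
import numpy as np
from matplotlib.colors import ListedColormap, Normalize, to_hex

# Hypothetical stand-in for the question's data: 16 cells holding codes 0..3
codes = np.array([0, 1, 1, 2, 2, 3, 3, 0, 0, 1, 1, 2, 2, 3, 3, 0], dtype=float)
color_dict = {0.0: "#da291e", 1.0: "#e7112d", 2.0: "#83cd39", 3.0: "#009934"}
values = sorted(color_dict)

# The question's construction: one colormap entry per cell (16 entries)
per_cell_cmap = ListedColormap([color_dict[c] for c in codes])
# Without vmin/vmax, imshow scales the data range (here 0..3) onto 0..1
auto_norm = Normalize(vmin=codes.min(), vmax=codes.max())
per_cell = {v: to_hex(per_cell_cmap(auto_norm(v))) for v in values}

# One entry per distinct value, with the range pinned to 0..N
per_value_cmap = ListedColormap([color_dict[v] for v in values])
fixed_norm = Normalize(vmin=0, vmax=per_value_cmap.N)
per_value = {v: to_hex(per_value_cmap(fixed_norm(v))) for v in values}

for v in values:
    print(v, "wanted", color_dict[v],
          "| per-cell cmap:", per_cell[v],
          "| per-value cmap:", per_value[v])
```

In this toy case the per-cell colormap returns the wrong color for 1.0, 2.0 and 3.0, because each value ends up pointing at whatever cell happens to sit at the scaled index. The per-value colormap with vmin=0 and vmax=N returns the intended color for every value.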

The following creates a dictionary mapping letters to numbers and applies it to the dataframe. It then creates a colormap with as many colors as there are possible values in the dataframe. Plotting with imshow then works as long as the colormap is normalized between zero and the number of colors in the colormap (vmin=0, vmax=len(cmap.colors)). This explicit normalization matters when not all possible values actually occur in the dataframe being plotted, e.g. if the letters A and H were missing.

import numpy as np; np.random.seed(42)
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


df = pd.DataFrame(np.random.choice(list("ABCDEFGH"), size=(8,8)))
print(df)

#mapping from letters to numbers
letter2num = dict(zip(list("ABCDEFGH"), np.arange(8)))
df2 = pd.DataFrame(np.array( [letter2num[i] for i in df.values.flat] ).reshape(df.shape))


#produce colormap with as many colors as there are unique values in df
colors = ["pink", "red", "violet", "blue", 
          "turquoise", "limegreen", "gold", "brown"]  # use hex colors here, if desired.
cmap = ListedColormap(colors)

fig, ax = plt.subplots()
ax.imshow(df2.values, vmin=0, vmax=len(cmap.colors), cmap=cmap)


for i in range(len(df2)):
    for j in range(len(df2.columns)):
        ax.text(j,i, df.values[i,j], ha="center", va="center")
plt.show()
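As an alternative to a colormap, imshow also accepts an array of shape (rows, cols, 4) holding RGBA values, in which case no cmap or normalization is applied. The sketch below is not part of the approach above; it rebuilds the 3 x 6 frame from the question and looks up each cell's color directly in the question's color_dict. The legend built from Patch handles is an optional extra.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
from matplotlib.patches import Patch

# The 3 x 6 frame and the color_dict from the question
df = pd.DataFrame([list("AAAABB"), list("BBBCCD"), list("DDEEFF")])
color_dict = {
    "A": "#DA291E", "B": "#83DF39", "C": "#E8132d",
    "D": "#008933", "E": "#006CB3", "F": "#52BFEC",
}

# Look up an RGBA tuple for every cell; the result has shape (rows, cols, 4)
rgba = np.array([[to_rgba(color_dict[letter]) for letter in row]
                 for row in df.to_numpy()])

fig, ax = plt.subplots()
ax.imshow(rgba, interpolation="none")  # RGBA input bypasses cmap and norm

# Optional legend with one colored patch per letter
handles = [Patch(color=color, label=letter) for letter, color in color_dict.items()]
ax.legend(handles=handles, bbox_to_anchor=(1.02, 1), loc="upper left")

# Call plt.show() or fig.savefig(...) here to view the result
```

Because every cell carries its own color, this works with the string values as they are and does not depend on which letters happen to occur in the data.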

solution1
2 ACCEPTED 2018-09-13 23:33:50