
Pandas DataFrame plot, colors are not unique

According to the pandas manual, the colormap parameter can be used to select colors from a matplotlib colormap object. However, for a bar chart the color of each bar has to be selected manually. This is not practical: with many bars, the manual effort is tedious. I expected that if no color is specified, each object/class would get a unique color. Unfortunately, this is not the case: the colors repeat, and only 10 unique colors are provided.

Code for reproduction:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,100,size=(100, 25)), columns=list('ABCDEFGHIJKLMNOPQRSTUVWXY'))
df.set_index('A', inplace=True)
df.plot(kind='bar', stacked=True, figsize=(20, 10))
plt.title("some_name")
plt.savefig("some_name" + '.png')

Does anybody have an idea how to get a unique color for each class in the diagram? Thanks in advance.

That's probably because the default property cycle contains only 10 colors.
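As a quick check of that claim, the default cycle can be inspected through matplotlib's rcParams. This is a minimal sketch that assumes an unmodified default style; the name default_colors is only illustrative, and a custom matplotlibrc or style sheet may give a different count.

```python
import matplotlib as mpl

# Colors of the current property cycle. Assumption: the default style is
# active, in which case this is the 10-color "tab10" palette.
default_colors = mpl.rcParams["axes.prop_cycle"].by_key()["color"]

print(len(default_colors))
print(default_colors)
```

Once a plot has more series than this list has entries, matplotlib starts again from the first color, which would explain the repetition in the question.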

A workaround is to build a list of random colors (24 in your case) and pass it as the color keyword argument to pandas.DataFrame.plot:

import random

# One random hex color such as "#3FA2C8" per column
list_colors = ["#" + "".join(random.choice("0123456789ABCDEF") for _ in range(6))
               for _ in range(len(df.columns))]

df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)


Update:

It can be hard to find a palette of 24 very distinct colors. However, you can use one of the palettes available in seaborn:


import seaborn as sns #pip install seaborn

list_colors = sns.color_palette("hsv", n_colors=24)

df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)
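If you would rather not add seaborn as a dependency, a similar palette can probably be built by sampling a matplotlib colormap at evenly spaced positions. The sketch below is one way to do that; sample_colormap is a hypothetical helper name, and the choice of the Agg backend is an assumption made only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no GUI backend needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_hex


def sample_colormap(name, n):
    """Return n hex colors taken at evenly spaced positions of a colormap."""
    cmap = plt.get_cmap(name)
    # endpoint=False: "hsv" is cyclic, so positions 0 and 1 are both red
    positions = np.linspace(0, 1, n, endpoint=False)
    return [to_hex(cmap(p)) for p in positions]


list_colors = sample_colormap("hsv", 24)
print(list_colors)
```

The resulting list can be passed to df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors) in the same way as the seaborn palette.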

Another solution would be to use scipy.spatial.distance.euclidean from SciPy to reject any random color that is too close, in RGB space, to a color that has already been picked:

import random

from scipy.spatial import distance  # pip install scipy

def hex_to_rgb(hex_color):
    # "#RRGGBB" -> (R, G, B) with each channel in 0..255
    return tuple(int(hex_color[i:i+2], 16) for i in (1, 3, 5))

def distinct_colors(n):
    colors = []
    while len(colors) < n:
        color = "#" + "".join(random.choice("0123456789ABCDEF") for _ in range(6))
        # Keep the candidate only if it is more than 50 RGB units away from every color kept so far
        if all(distance.euclidean(hex_to_rgb(color), hex_to_rgb(c)) > 50 for c in colors):
            colors.append(color)
    return colors

colors = distinct_colors(len(df.columns)) #len(df.columns)=24
sns.palplot(colors)
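To see how well separated a given palette actually is, the smallest pairwise RGB distance can be computed with the standard library alone. This is only a sketch: min_pairwise_distance is a hypothetical helper, and the three-color palette is made-up sample data. For scale, the largest possible distance in this space is between black and white, roughly 441.7, so the threshold of 50 used above is fairly small.

```python
from itertools import combinations
from math import dist


def hex_to_rgb(hex_color):
    # "#RRGGBB" -> (R, G, B) with each channel in 0..255
    return tuple(int(hex_color[i:i + 2], 16) for i in (1, 3, 5))


def min_pairwise_distance(colors):
    """Smallest Euclidean RGB distance between any two colors in the list."""
    rgb = [hex_to_rgb(c) for c in colors]
    return min(dist(a, b) for a, b in combinations(rgb, 2))


# Made-up palette: black, a very dark blue, and white
print(min_pairwise_distance(["#000000", "#000032", "#FFFFFF"]))
```

Keep in mind that Euclidean distance in RGB is only a rough proxy for how different two colors look to the eye.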


