Why does the figure shown by matplotlib have more than three colors when I have only three labels?
I'm starting to learn to use matplotlib to draw figures. When I was using the famous iris dataset and trying to draw a scatter plot, I ran into a question.
import numpy as np
import pandas as pd
import matplotlib.pylab as pl
raw = pd.read_csv('iris.csv')
data = raw.values
print data
x = data[:,0]
y = data[:,1]
pl.scatter(x,y,color = ['r','g','b'], s = [30,40,50], alpha=0.5)
pl.figure()
pl.show()
labels = set(data[:,4])
print labels
I got the output:
...
[6.7 3.3 5.7 2.5 'Iris-virginica']
[6.7 3.0 5.2 2.3 'Iris-virginica']
[6.3 2.5 5.0 1.9 'Iris-virginica']
[6.5 3.0 5.2 2.0 'Iris-virginica']
[6.2 3.4 5.4 2.3 'Iris-virginica']
[5.9 3.0 5.1 1.8 'Iris-virginica']]
set(['Iris-virginica', 'Iris-setosa', 'Iris-versicolor'])
I only used the first two features because I didn't know whether it is possible to draw high-dimensional figures.
This is the figure I got:
There were more than three colors although, as you can see from the output, there were exactly three labels ('Iris-virginica', 'Iris-setosa', 'Iris-versicolor').
How does matplotlib decide which color to use?
What are the different colors for?
What should I do to get a three-color scatter plot?
You obtained this figure with pyplot.scatter, more specifically with this line of code:
pl.scatter(x, y, color=['r','g','b'], s=[30,40,50], alpha=0.5)
In the line above, there is no indication whatsoever of the labels. x and y are just two lists of numbers.
To color the dots, scatter uses the argument color=['r', 'g', 'b']. If color is the same size as x and y, each dot gets its own defined color. But if color is shorter than x and y, scatter loops through color as many times as needed. For example:
x = [1, 2, 3, 4, 5]
color = ['r', 'g', 'b'] becomes ['r', 'g', 'b', 'r', 'g']
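The cycling described above can be imitated in a few lines of plain Python. This is only a sketch of the behavior, not matplotlib's internal code, and the helper name `cycle_to_length` is made up for illustration. Note also that, as far as I can tell, recent matplotlib releases reject a color list whose length differs from the data instead of cycling it, so the repetition may only be observable on older versions.

```python
from itertools import cycle, islice

def cycle_to_length(values, n):
    """Repeat `values` in order until exactly n items have been produced."""
    return list(islice(cycle(values), n))

x = [1, 2, 3, 4, 5]
color = ['r', 'g', 'b']

# One color per data point, obtained by looping over the short list.
expanded = cycle_to_length(color, len(x))
print(expanded)  # ['r', 'g', 'b', 'r', 'g']
```

The point to take away is that the color a dot receives depends only on its position in the data, not on its label.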
As for the last mystery, "why are there more than three colors on the plot", it's because the transparency alpha is set to 0.5 (all colors are 50% transparent). Some of the data points have the same or nearly the same x and y coordinates, so their markers overlap, the translucent colors blend, and it looks like there are more colors than red, green, and blue.
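To see why overlapping markers produce extra colors, here is a small worked example of standard "source over" alpha blending on RGB triples. It assumes a white background and pure red and blue markers; the function `blend_over` is a hypothetical helper written for this illustration, and the real renderer may differ in details.

```python
def blend_over(src, dst, alpha):
    """Composite color `src` with opacity `alpha` on top of color `dst`."""
    return tuple(alpha * s + (1 - alpha) * d for s, d in zip(src, dst))

WHITE = (1.0, 1.0, 1.0)
RED = (1.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 1.0)

# A single 50% transparent red dot on a white background looks pink.
red_on_white = blend_over(RED, WHITE, 0.5)
print(red_on_white)  # (1.0, 0.5, 0.5)

# A 50% transparent blue dot drawn on top of that red dot looks purple.
blue_on_red = blend_over(BLUE, red_on_white, 0.5)
print(blue_on_red)  # (0.5, 0.25, 0.75)
```

Neither pink nor purple was in the color list, yet both appear wherever dots overlap, which is why the figure seems to contain more than three colors.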
To plot the right colors, you need to use the label information. The question "Python scatter plot with colors corresponding to strings" should help you.
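As a minimal sketch of that idea, each label can be mapped to one color and the resulting per-point list passed to scatter. The helper names `colors_for_labels` and `plot_by_label` and the choice of palette are assumptions made for this example, not part of the original code. The matplotlib import sits inside the plotting function so that the mapping helper can be tried without it.

```python
def colors_for_labels(labels, palette=('r', 'g', 'b')):
    """Return (one color per point, label-to-color mapping)."""
    unique = sorted(set(labels))
    if len(unique) > len(palette):
        raise ValueError("palette has fewer colors than there are labels")
    mapping = dict(zip(unique, palette))
    return [mapping[label] for label in labels], mapping

def plot_by_label(x, y, labels):
    """Scatter x against y, coloring each point by its label."""
    import matplotlib.pyplot as plt
    colors, _ = colors_for_labels(labels)
    # The color list has the same length as x and y, so nothing is cycled.
    plt.scatter(x, y, c=colors, s=40, alpha=0.5)
    plt.show()

# Tiny stand-in for the label column, data[:, 4] in the question.
sample_labels = ['Iris-setosa', 'Iris-virginica', 'Iris-setosa', 'Iris-versicolor']
sample_colors, sample_mapping = colors_for_labels(sample_labels)
print(sample_colors)  # ['r', 'b', 'r', 'g']
```

With the data from the question, the call would be plot_by_label(data[:, 0], data[:, 1], data[:, 4]). Overlapping dots will still blend because of alpha=0.5; set alpha=1 if only the three pure colors should be visible.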