
Python Iris Dataset scatter plot error in code

My code keeps raising an error, and I am not sure why.

Here is the code.

from itertools import permutations
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

'''download iris.csv from https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv'''
#Load Iris.csv into pandas dataframe.
iris = pd.read_csv("iris.csv")

#Get all ordered pairs (permutations of length 2)
#of the four feature columns
perm = permutations(["sepal_width", "sepal_length","petal_length","petal_width"],2)
import itertools

colors={'Iris-setosa':'red', 'Iris-versicolor':'blue', 'Iris-virginica':'green'}
#Plot each pair in its own subplot
plt.figure(1)
k=1
for i in list(perm):
    #print(i)
    plt.subplot(4,3,k)
    plt.scatter(iris[i[0]],iris[i[1]],c=iris['species'].apply(lambda x: colors[x]),s=3)
    k+=1

plt.show()

And here is the error that comes with it:

Traceback (most recent call last):
  File "C:\Users\mulle\OneDrive\Desktop\IrisDatasetPlot.py", line 23, in <module>
    plt.scatter(iris[i[0]],iris[i[1]],c=iris['species'].apply(lambda x: colors[x]),s=3)
  File "C:\Users\mulle\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\series.py", line 3848, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2327, in pandas._libs.lib.map_infer
  File "C:\Users\mulle\OneDrive\Desktop\IrisDatasetPlot.py", line 23, in <lambda>
    plt.scatter(iris[i[0]],iris[i[1]],c=iris['species'].apply(lambda x: colors[x]),s=3)
KeyError: 'setosa'
>>> 

I do not understand why 'setosa' raises a KeyError.

I just downloaded the dataset. It looks like this:

sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa

In the line c=iris['species'].apply(lambda x: colors[x]), the lambda looks up colors[x] for every element of the species series. In this CSV the first value is setosa (and I believe the other two are versicolor and virginica), but your colors dictionary only has the keys 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica'. There is no 'setosa' key, hence the KeyError.
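
A quick way to confirm this is to compare the values in the species column with the dictionary keys. The sketch below uses a tiny hand-made DataFrame as a stand-in for iris.csv (an assumption, so that it runs without the file); the name missing is only illustrative.

```python
import pandas as pd

# Stand-in for iris.csv: only the column that matters here (assumed sample)
iris = pd.DataFrame({"species": ["setosa", "versicolor", "virginica", "setosa"]})

# The dictionary from the question
colors = {'Iris-setosa': 'red', 'Iris-versicolor': 'blue', 'Iris-virginica': 'green'}

# Labels that occur in the data but are not keys of the dictionary
missing = sorted(set(iris["species"].unique()) - set(colors))
print(missing)
```

Any label printed here would raise a KeyError when passed to colors[x].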

Your data:

from itertools import permutations
import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV straight from the URL
df = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
df.head()

You can build the colors with map, and do it once outside the loop:

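# All ordered pairs of the four feature columns
perm = list(permutations(df.columns[0:4], 2))
# Keys must match the labels in the species column exactly
colors = {'setosa': 'red', 'versicolor': 'blue', 'virginica': 'green'}
plt_col = df['species'].map(colors)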
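
One thing to keep in mind with map: unlike colors[x] inside apply, a plain dictionary passed to map does not raise for a label it does not contain; it yields NaN instead. A minimal sketch, with a hand-made Series standing in for df['species'] and a deliberately incomplete dictionary (both are assumptions for illustration):

```python
import pandas as pd

# Stand-in for df['species'] (assumed sample)
species = pd.Series(["setosa", "versicolor", "virginica"])
partial = {"setosa": "red", "versicolor": "blue"}  # 'virginica' left out on purpose

# Labels without a dictionary entry become NaN rather than raising KeyError
mapped = species.map(partial)
unmapped = species[mapped.isna()].tolist()
print(unmapped)
```

Checking for NaN like this before plotting catches a mistyped key early.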

Then plot:

fig, axs = plt.subplots(4, 3)
for k, (x_var, y_var) in enumerate(perm):
    # Fill the 4x3 grid column by column
    subplt_row = k % 4
    subplt_col = k // 4
    axs[subplt_row, subplt_col].scatter(df[x_var], df[y_var], c=plt_col, s=3)
    axs[subplt_row, subplt_col].set_title(y_var + " vs " + x_var, size=7)
# Adjust spacing after the titles exist so they do not overlap
fig.tight_layout()
plt.show()
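
To see why a 4 by 3 grid is the right size and how the row and column arithmetic in the loop fills it, the indices can be checked without plotting anything. This sketch hard-codes the four column names from the CSV header; cells is an illustrative name.

```python
from itertools import permutations

cols = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
perm = list(permutations(cols, 2))  # ordered pairs: 4 * 3 = 12

# Row cycles through 0..3, column advances after every 4 plots
cells = [(k % 4, k // 4) for k in range(len(perm))]

print(len(perm))
print(len(set(cells)) == len(perm))  # True when no subplot is reused
```

Because permutations yields ordered pairs, each pair of features appears twice with the axes swapped; itertools.combinations would give 6 unordered pairs if that duplication is not wanted.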

