
Pandas - seaborn lineplot hue unexpected legend

I have a data frame of client names, dates and transactions. I'm not sure how far back my error goes, so here is all the pre-processing I do:

import pandas as pd

data = pd.read_excel('Test.xls')
## convert to datetime object
data['Date Order'] = pd.to_datetime(data['Date Order'], format='%d.%m.%Y')
## add columns for month and year of each row for easier analysis later
data['month'] = data['Date Order'].dt.month
data['year'] = data['Date Order'].dt.year

So the data frame becomes something like:

Date Order           NameCustomers         SumOrder          month         year
2019-01-02 00:00:00   Customer 1             290              1            2019  
2019-02-02 00:00:00   Customer 1             50               2            2019  
----- 
2020-06-28 00:00:00   Customer 2             900              6            2020
------ 

...and so on. Next I group by both month and year and calculate the mean.

groupedMonthYearMean = data.groupby(['month', 'year'])['SumOrder'].mean().reset_index()

Output:

month    year    SumOrder 
1        2019    233.08
1        2020    303.40
2        2019    255.34   
2        2020    842.24
--------------------------

I use the resulting dataframe to make a lineplot, which tracks the mean SumOrder for each month and draws one line per year.

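import matplotlib.pyplot as plt
import seaborn as sns

# lineplot returns the Axes; set the title on it separately so that
# linechart refers to the plot rather than to the title text
linechart = sns.lineplot(x='month',
                         y='SumOrder',
                         hue='year',
                         data=groupedMonthYearMean)
linechart.set_title('Mean Sum Order by month')
plt.show()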

I have attached a screenshot of the resulting plot. Overall it shows what I expected to create. In my entire data, the 'year' column has only two values: 2019 and 2020. For some reason, whatever I do, they show up in the legend as 0, -1 and -2. Any ideas what is going on?

[Screenshot: unexpected hue legend]

You want to change the dtype of the year column from int to category:

df['year'] = df['year'].astype('category')

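This is due to how hue treats ints. seaborn reads a numeric hue column as a continuous variable, so it maps the values onto a sequential colormap and, by default, builds the legend from a sample of evenly spaced values rather than from the values actually present in the column. Once the column is categorical, each year is treated as a discrete level and gets its own legend entry.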
