Seaborn countplot with second axis with ordered data

I am trying to create a countplot with a lineplot drawn over it, as practice for some data visualisation I will be doing at work. I am looking at the Kickstarter data on Kaggle.

I run a countplot with a hue on the state of the project (successful, failed, canceled), with both the categories and the hue levels ordered by their counts:

filter_list = ['failed', 'successful', 'canceled']
df2 = df[df.state.isin(filter_list)]

fig = plt.gcf()
fig.set_size_inches(16, 10)
sns.countplot(x='main_category', hue='state', data=df2,
              order=df2['main_category'].value_counts().index,
              hue_order=df2['state'].value_counts().index)

This comes out as follows: [image]

I then create my second axis and add a lineplot:

fig, ax = plt.subplots()
fig.set_size_inches(16, 10)

ax = sns.countplot(x='main_category', hue='state', data=df, ax=ax,
                   order=df2['main_category'].value_counts().index,
                   hue_order=df2['state'].value_counts().index)

ax2 = ax.twinx()
sns.lineplot(x='main_category', y='backers', data=df2, ax=ax2)

But this changes the column labels, as seen below: [image]

It appears that the data is the same; it's just the order of the columns that is different. I am still learning, so my code may be inefficient or partly redundant, but any help would be appreciated. The only other relevant thing is how df is created, which is as follows:

import pandas as pd
import numpy as np
import seaborn as sns; sns.set(style="white", color_codes=True)
import matplotlib.pyplot as plt

df = pd.read_csv('ks.csv')
df = df.drop(['ID'], axis = 1)
df.head()

I don't think lineplot is what you are looking for. lineplot is supposed to be used with numeric data, not categorical. I'm even surprised this worked at all.
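A possible explanation for the reordered labels, offered as an assumption rather than something stated in this thread: lineplot sorts its data by x by default (sort=True), and matplotlib places string values on an axis in the order it first sees them. On a twin axis that shares x with the countplot, the tick labels would then follow alphabetical order instead of the count order. The sketch below shows only the matplotlib part, with hypothetical category names:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical count order, as a countplot with order=... would use it
count_order = ["Music", "Games", "Art"]

fig, ax = plt.subplots()
# Strings arriving in sorted order (as after lineplot's default sort)
ax.plot(sorted(count_order), [1, 2, 3])

fig.canvas.draw()  # make sure tick labels are populated
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)  # alphabetical, not the count order
```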

I think you are looking for pointplot instead:

filter_list = ['failed', 'successful', 'canceled']
df2 = df[df.state.isin(filter_list)]
order = df2['main_category'].value_counts().index

fig = plt.figure()
ax1 = sns.countplot(x='main_category', hue='state', data=df2, order=order, 
              hue_order=filter_list)
ax2 = ax1.twinx()
sns.pointplot(x='main_category', y='backers', data=df2, ax=ax2, order=order)
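
Since ks.csv is not included here, the following is a self-contained sketch of the same approach on a small made-up DataFrame (the rows are hypothetical; only the column names match the question). It checks that the x tick labels keep the count order after the pointplot is added:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the Kickstarter data (not the real ks.csv)
df2 = pd.DataFrame({
    "main_category": ["Music"] * 5 + ["Games"] * 3 + ["Art"] * 2,
    "state": ["failed", "successful", "failed", "canceled", "successful",
              "failed", "successful", "failed",
              "successful", "failed"],
    "backers": [10, 40, 5, 0, 25, 300, 900, 120, 15, 3],
})

filter_list = ["failed", "successful", "canceled"]
# Categories sorted by frequency: Music (5), Games (3), Art (2)
order = df2["main_category"].value_counts().index

fig, ax1 = plt.subplots(figsize=(8, 5))
sns.countplot(x="main_category", hue="state", data=df2,
              order=order, hue_order=filter_list, ax=ax1)

ax2 = ax1.twinx()  # second y axis that shares the x axis with ax1
sns.pointplot(x="main_category", y="backers", data=df2,
              order=order, ax=ax2)

fig.canvas.draw()  # make sure tick labels are populated
labels = [t.get_text() for t in ax1.get_xticklabels()]
print(labels)
```

Passing the same order to both calls is what keeps the two layers aligned on the shared x axis.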


Note that, used like this, pointplot shows the average number of backers in each category. If that's not what you want, you can pass another aggregation function using the estimator= parameter.

For example:

sns.pointplot(x='main_category', y='backers', data=df2, ax=ax2, order=order, estimator=np.sum)
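
To see which numbers the two variants would plot, the same aggregation can be reproduced directly in pandas. This is only a sketch on made-up rows (the values are hypothetical), and it ignores the confidence intervals that pointplot also draws:

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question
df2 = pd.DataFrame({
    "main_category": ["Music"] * 3 + ["Games"] * 2 + ["Art"],
    "backers": [10, 20, 30, 100, 300, 7],
})

# Same ordering rule as in the answer: categories by descending count
order = df2["main_category"].value_counts().index

grouped = df2.groupby("main_category")["backers"]
mean_backers = grouped.mean().reindex(order)   # what the default estimator shows
total_backers = grouped.sum().reindex(order)   # what estimator=np.sum shows

print(mean_backers)
print(total_backers)
```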
