
Why does seaborn displot show the categories in an irregular order?

I used this script:

sns.displot(data=df, x='New Category', height=5, aspect=3, kde=True)

but the categories on the x-axis come out in an irregular order, as in the picture below. I want them in this order:

  • Less than 2 hours
  • Between 1 to 2 hours
  • Between 2 to 4 hours
  • Between 4 to 6 hours
  • Between 6 to 12 hours
  • More than 12 hours

Result of the script:

[Image: result of the script]

The easiest way to fix the order is via pd.Categorical:

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

# first, create some test data
categories = ['Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours',
              'Between 4 to 6 hours', 'Between 6 to 12 hours', 'More than 12 hours']
weights = np.random.rand(len(categories)) + 0.1
weights /= weights.sum()
df = pd.DataFrame({'New Category': np.random.choice(categories, 1000, p=weights)})

# fix an order on the column via pd.Categorical
df['New Category'] = pd.Categorical(df['New Category'], categories=categories, ordered=True)

# displot now uses the fixed order 
sns.displot(data=df, x='New Category', height=5, aspect=3, kde=True)
plt.show()

[Image: sns.displot with the order fixed]
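To see what the categorical conversion changes without drawing anything, here is a minimal sketch using pandas only. The five-row sample in raw is made up for illustration; the category list is the one from the question. It shows that counting and sorting follow the declared order, not the alphabetical one and not the order of the rows.

```python
import pandas as pd

categories = ['Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours',
              'Between 4 to 6 hours', 'Between 6 to 12 hours', 'More than 12 hours']

# hypothetical sample, deliberately out of order
raw = pd.Series(['More than 12 hours', 'Between 2 to 4 hours', 'Less than 2 hours',
                 'Between 2 to 4 hours', 'Between 6 to 12 hours'])

# attach the desired order to the values
cat = pd.Series(pd.Categorical(raw, categories=categories, ordered=True))

# with sort=False the counts are listed in the declared category order,
# and categories that never occur are kept with a count of 0
counts = cat.value_counts(sort=False)
print(counts)

# sorting the values also follows the declared order
sorted_values = cat.sort_values().tolist()
print(sorted_values)
```

Note that values not listed in categories become NaN after the conversion, so it is worth checking the spelling of every label.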

The reason is the order in which the values first appear in the original df:

import pandas as pd
import seaborn as sns

# the categories first appear in the desired order
df = pd.DataFrame({'test': ['Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours', 'Between 4 to 6 hours', 'Between 6 to 12 hours', 'More than 12 hours', 'More than 12 hours']})
sns.displot(data=df, x='test', height=5, aspect=3, kde=True)

Result:

[Image: resulting plot]

Whereas, with the last two categories swapped:

import pandas as pd
import seaborn as sns

# 'More than 12 hours' now appears before 'Between 6 to 12 hours'
df = pd.DataFrame({'test': ['Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours', 'Between 4 to 6 hours', 'More than 12 hours', 'More than 12 hours', 'Between 6 to 12 hours']})
sns.displot(data=df, x='test', height=5, aspect=3, kde=True)

Result:

[Image: resulting plot]
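The two plots are consistent with seaborn placing a plain string column's categories in order of first appearance. That is my reading of the results above, not something I have checked in the seaborn source. A small sketch that prints the first-appearance order of both columns with Series.unique(), which returns values in the order they are first seen:

```python
import pandas as pd

first = pd.Series(['Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours',
                   'Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours',
                   'Between 4 to 6 hours', 'Between 6 to 12 hours',
                   'More than 12 hours', 'More than 12 hours'])
second = pd.Series(['Less than 2 hours', 'Less than 2 hours', 'Less than 2 hours',
                    'Less than 2 hours', 'Between 1 to 2 hours', 'Between 2 to 4 hours',
                    'Between 4 to 6 hours', 'More than 12 hours',
                    'More than 12 hours', 'Between 6 to 12 hours'])

# unique() keeps the order of first appearance (it does not sort)
order_first = list(first.unique())
order_second = list(second.unique())

print(order_first)
print(order_second)
```

The two lists differ only in their last two entries, which is the same difference the two plots show.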

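So, sort the rows into the desired order before plotting:

# rank of each category in the desired order
mapping = {'Less than 2 hours': 0, 'Between 1 to 2 hours': 1, 'Between 2 to 4 hours': 2, 'Between 4 to 6 hours': 3, 'Between 6 to 12 hours': 4, 'More than 12 hours': 5}
# look up the rank of every row (df and sns as defined above)
df['ord'] = df['test'].map(mapping)
# after sorting, the categories first appear in the desired order
df = df.sort_values('ord')
sns.displot(data=df, x='test', height=5, aspect=3, kde=True)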
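One thing to watch with a mapping like this: a label that is not a key of mapping (a typo, a stray space) breaks the ordering. A plain dictionary lookup raises KeyError, while Series.map returns NaN for it. A quick check before sorting might look like the sketch below; the column col and its misspelled label are invented for illustration.

```python
import pandas as pd

mapping = {'Less than 2 hours': 0, 'Between 1 to 2 hours': 1, 'Between 2 to 4 hours': 2,
           'Between 4 to 6 hours': 3, 'Between 6 to 12 hours': 4, 'More than 12 hours': 5}

# hypothetical column; the second label has a trailing space
col = pd.Series(['Less than 2 hours', 'More than 12 hours ', 'Between 2 to 4 hours'])

# labels missing from the mapping come back as NaN
ranks = col.map(mapping)

# list the offending labels so they can be fixed before sorting
unmapped = list(col[ranks.isna()].unique())
print(unmapped)
```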

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
