Multiple boxplots based on conditions

I have a dataframe with two columns. The power column represents the power consumption of the system. The component_status column splits the data in two, depending on whether the component is OFF or ON: a value of 153 means the component is ON, and a value of 150 means it is OFF.

The result I am looking for is a single plot with three boxplots, drawn with sns.boxplot. One shows the power consumption for all the data, called "TOTAL". The other two show the power consumption when the component was ON or OFF, called "COMPONENT = ON" and "COMPONENT = OFF".

The data frame example is as follows:

power | component_status |
 0.5  |       150        |
 1.5  |       150        |
 2.5  |       150        |
 0.3  |       153        |
 0.5  |       153        |
 1.5  |       153        |
 2.5  |       150        |
 0.3  |       153        |

Thanks for the help.

Your first step is to build your data frame with the conditions. There are a few ways to go about this.

  1. Let's start with an initial df1 (dataframe #1) as you have given. Then, let's add a condition column that says "Total" for every row. You can use print(df1) to see what this looks like.
  2. Then let's copy that dataframe into df2, and replace the values in its condition column with the Off/On criteria derived from component_status.
  3. Our final dataframe df is just a concatenation of df1 and df2.
  4. Now we have a dataframe df that is ready to be plotted with Seaborn.

Imports and DataFrame

# Set up
import pandas as pd
import numpy as np
import seaborn as sns

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df1 = pd.DataFrame(
    data=zip(power, component_status), columns=["power", "component_status"]
)

# Step 1
df1["condition"] = "Total"
# print(df1)

# Step 2
df2 = df1.copy()

df2["condition"] = np.where(df2["component_status"] == 153, "On", "Off")

# If you have several criteria, it can be easier to use np.select
# ... or just use Pandas directly:
# df2.loc[(df2['component_status'] == 153), 'condition'] = 'On'
# df2.loc[(df2['component_status'] == 150), 'condition'] = 'Off'

# Step 3
df = pd.concat([df1, df2])

df view

   power  component_status condition
0    0.5               150     Total
1    1.5               150     Total
2    2.5               150     Total
3    0.3               153     Total
4    0.5               153     Total
5    1.5               153     Total
6    2.5               150     Total
7    0.3               153     Total
0    0.5               150       Off
1    1.5               150       Off
2    2.5               150       Off
3    0.3               153        On
4    0.5               153        On
5    1.5               153        On
6    2.5               150       Off
7    0.3               153        On
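
The comment in Step 2 mentions np.select for the case of several criteria. A minimal sketch of that variant, using the same sample data, might look like the following. The "Unknown" default label is an assumption added for illustration; it only applies to rows whose component_status is neither 150 nor 153.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
df2 = pd.DataFrame(
    {
        "power": [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3],
        "component_status": [150, 150, 150, 153, 153, 153, 150, 153],
    }
)

# One boolean mask per criterion, paired by position with the label to assign.
masks = [
    df2["component_status"] == 153,
    df2["component_status"] == 150,
]
labels = ["On", "Off"]

# Rows matching none of the masks receive the default label.
# "Unknown" is a hypothetical placeholder, not part of the original answer.
df2["condition"] = np.select(masks, labels, default="Unknown")
```

With only two status codes this gives the same condition column as the np.where call; np.select mostly pays off once a third or fourth status code appears.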

Plotting

# Step 4
ax = sns.boxplot(data=df, x='condition', y='power')
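
If the boxes should carry the exact labels from the question and appear in a fixed order, one possible variation is sketched below. It assumes a mapping dictionary (label_map, a name introduced here for illustration) and uses the order parameter of sns.boxplot. It also passes ignore_index=True to pd.concat so the combined frame has a unique index, which is a choice made for this sketch rather than something the answer above requires.

```python
import pandas as pd
import seaborn as sns

power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
component_status = [150, 150, 150, 153, 153, 153, 150, 153]
df1 = pd.DataFrame({"power": power, "component_status": component_status})

# Hypothetical mapping from status code to the label wanted on the x-axis.
label_map = {153: "COMPONENT = ON", 150: "COMPONENT = OFF"}

# One copy of the data labelled "TOTAL", one copy labelled by status.
total = df1.assign(condition="TOTAL")
by_status = df1.assign(condition=df1["component_status"].map(label_map))
df = pd.concat([total, by_status], ignore_index=True)

# `order` fixes the left-to-right position of the three boxes.
box_order = ["TOTAL", "COMPONENT = ON", "COMPONENT = OFF"]
ax = sns.boxplot(data=df, x="condition", y="power", order=box_order)
ax.set_xlabel("")  # the tick labels already describe each box
```

In a plain script, the figure can then be displayed with matplotlib's plt.show() or saved with ax.figure.savefig(...).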

