
Grouped Boxplots by Categorical Variable

I'm using pandas on a large dataset that I have already reduced to the information I need. I would like to plot the distribution of the number of friends for users from two different countries as side-by-side boxplots (what I'm calling grouped boxplots), broken down by the number of hashtags used in their post (ranging from 1 to 6; I'm treating this as a categorical variable). That gives a total of 2*6=12 boxplots, all in the same frame for easy comparison.

I've done some research and I'm aware of df.boxplot(by='x'), but this doesn't account for the extra level of comparing the two countries.

The dataset has columns for number of hashtags (int), country (string), and number of friends (int).

Note that I'm fairly new to graphing in Python, including things like axes and subplots, so please include some extra info in your answer if possible.

Edit: small sample of the dataset

       #followers  #friends  #mentions  #hashtags  country  lang_user place  
450            53        71          1          0       ja         es   NaN  
489            54        34          1          1       ja         es   NaN  
867          1569      1999          0          0       en         es   NaN  
1021          224       242          0          3       ja         ja   NaN  
1022          377       506          1          5       ja         ja   NaN  
1023          315       305          0          2       ja         ja   NaN

I like using seaborn for this kind of visualization. I guess the "extra level" you mean is what seaborn calls "hue".

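import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")
tips = sns.load_dataset("tips")  # seaborn's example dataset
# x picks the groups; hue splits each group into side-by-side boxes
ax = sns.boxplot(x="day", y="total_bill", hue="smoker",
                 data=tips, palette="Set3")
plt.show()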

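The result is a grouped boxplot: for each day on the x-axis, two boxes sit side by side, one per value of smoker. (The image from the original post is not included here.)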
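Applied to the data in the question, the same call might look like the sketch below. The column names ("#hashtags", "#friends", "country") are taken from the sample in the question; the rows themselves are made up for illustration, and the labels, title, and output filename are my own choices. Since the question asks about axes and subplots: plt.subplots() creates the figure (the whole canvas) and an Axes (one plotting area), and passing ax=ax tells seaborn to draw into that Axes.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows that only mimic the column names from the question.
df = pd.DataFrame({
    "#hashtags": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "country":   ["ja", "ja", "en", "en"] * 3,
    "#friends":  [71, 34, 1999, 850, 242, 506, 305, 120, 90, 410, 1500, 640],
})

# fig is the whole figure; ax is the single plotting area inside it.
fig, ax = plt.subplots(figsize=(8, 4))

# x = one group per hashtag count, hue = one box per country within each group.
sns.boxplot(x="#hashtags", y="#friends", hue="country", data=df, ax=ax)

# The Axes object is where labels and titles are set.
ax.set_xlabel("Number of hashtags in the post")
ax.set_ylabel("Number of friends")
ax.set_title("Friends by hashtag count and country")

# Write the figure to a file; in an interactive session use plt.show() instead.
fig.savefig("grouped_boxplot.png")
```

With the full dataset (hashtag counts 1 to 6, two countries) this should give the 12 boxes described in the question, grouped in pairs.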

Check out the documentation: https://seaborn.pydata.org/generated/seaborn.boxplot.html
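If you would rather stay with plain pandas, the by argument of df.boxplot also accepts a list of columns, which gives one box per combination of values. This is a rough sketch of that alternative, not part of the seaborn approach above; the data is again made up, and only the column names come from the question. The boxes are not color-coded by country as they are with hue.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows that only mimic the column names from the question.
df = pd.DataFrame({
    "#hashtags": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "country":   ["ja", "ja", "en", "en"] * 3,
    "#friends":  [71, 34, 1999, 850, 242, 506, 305, 120, 90, 410, 1500, 640],
})

fig, ax = plt.subplots(figsize=(8, 4))

# Grouping by two columns yields one box per (hashtag count, country) pair.
df.boxplot(column="#friends", by=["#hashtags", "country"], ax=ax)

# pandas adds its own figure-level title; clear it and label the Axes instead.
fig.suptitle("")
ax.set_title("Friends by hashtag count and country")
ax.set_xlabel("(hashtags, country)")
ax.set_ylabel("Number of friends")

# Write the figure to a file; in an interactive session use plt.show() instead.
fig.savefig("grouped_boxplot_pandas.png")
```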

