
How to plot a graph with 2 variables on the x axis and count on the y axis?

I'm trying to plot a graph to show the number of attempts from distinct IP addresses, both authorized and unauthorized. The data I have looks something like this:

    Access Type     host/IP address Count
0   Authorized      206.196.21.129  23
1   Authorized      207.30.238.8    46
2   Authorized      208.62.55.75    23
3   Authorized      216.12.111.241  23
4   Authorized      63.197.98.106   23
5   Authorized      67.95.49.172    23
6   Unauthorized    207.243.167.114 23
7   Unauthorized    209.152.168.249 10
8   Unauthorized    65.166.159.14   10
9   Unauthorized    68.143.156.89   10

How do I go about doing it? I am thinking that the x-axis will have the IP addresses as the main header and the counts of the access types as the sub-header.
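To try the answers below, the table can be rebuilt as a pandas DataFrame. This is a minimal sketch that simply copies the ten sample rows from the question; the variable name `df` is chosen to match what the answers assume.

```python
import pandas as pd

# Sample rows copied from the table in the question.
df = pd.DataFrame({
    "Access Type": ["Authorized"] * 6 + ["Unauthorized"] * 4,
    "host/IP address": [
        "206.196.21.129", "207.30.238.8", "208.62.55.75", "216.12.111.241",
        "63.197.98.106", "67.95.49.172", "207.243.167.114", "209.152.168.249",
        "65.166.159.14", "68.143.156.89",
    ],
    "Count": [23, 46, 23, 23, 23, 23, 23, 10, 10, 10],
})

print(df)
```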

You can do something like this. In the following code, I've colored the "Unauthorized" IPs red and the "Authorized" IPs green; you can change that.

import pandas as pd
import matplotlib.pyplot as plt


# df is the DataFrame holding the data shown in the question
colors = ['r' if item == "Unauthorized" else 'g' for item in df["Access Type"]]
df.plot(kind='bar', x='host/IP address', y='Count', color=colors, legend=False)
plt.show()

This produces a bar chart with one bar per IP address, colored by access type.
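Because the bars are colored by hand, the chart carries no legend explaining the colors. One possible way to add one is sketched below. It draws the bars with matplotlib's `ax.bar` directly rather than `df.plot`, uses only a few rows from the question, and the `palette` dictionary and the output file name are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

# A few rows from the question's table, enough to show both access types.
df = pd.DataFrame({
    "Access Type": ["Authorized", "Authorized", "Unauthorized", "Unauthorized"],
    "host/IP address": ["206.196.21.129", "207.30.238.8",
                        "207.243.167.114", "209.152.168.249"],
    "Count": [23, 46, 23, 10],
})

# Hypothetical mapping from access type to bar color.
palette = {"Authorized": "g", "Unauthorized": "r"}
colors = df["Access Type"].map(palette).tolist()

fig, ax = plt.subplots()
ax.bar(df["host/IP address"], df["Count"], color=colors)

# One legend entry per access type, since the bars themselves carry no labels.
handles = [Patch(color=c, label=name) for name, c in palette.items()]
ax.legend(handles=handles, title="Access Type")

ax.set_xlabel("host/IP address")
ax.set_ylabel("Count")
ax.tick_params(axis="x", rotation=90)
fig.tight_layout()
fig.savefig("access_bars.png")  # or plt.show() in an interactive session
```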

Here is a more elegant way in Python, using plotnine:

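import pandas as pd
from plotnine import ggplot, aes, geom_bar, theme, element_text

# In a Jupyter notebook, run "%matplotlib inline" first to render plots in the cell output.
df = pd.read_csv('~/Downloads/Untitled spreadsheet - Sheet1.csv')

# The CSV used here names the IP column 'host address';
# use 'host/IP address' if your columns match the table in the question.
# The parentheses let the expression continue across lines.
(ggplot(df, aes(x='Access Type', y='Count', fill='host address'))
 + geom_bar(stat='identity', position='dodge'))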

Figure: IP addresses as colors.

If you do not have plotnine installed, use

pip install plotnine

Here is another form of coloring, if you are interested.

(ggplot(df, aes(x='host address', y='Count', fill='Access Type'))
 + geom_bar(stat='identity', position='dodge')
 + theme(axis_text_x=element_text(angle=45)))


I am a bit late to the party. Since some bar charts have already been shown as solutions, what do you think about using a bubble plot?


# Visualizing the access counts using a bubble plot
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

# df is the DataFrame holding the data shown in the question

# create custom legend
red_patch = mpatches.Patch(color='red', label='Unauthorized Access', alpha=0.4)
blue_patch = mpatches.Patch(color='blue', label='Authorized Access', alpha=0.4)

# specify figure size
plt.figure(figsize=(7, 7))

# specify bubble size, scaled with the count
size = df['Count'] * 25

# define fill colors: red for unauthorized, blue for authorized
fill_colors = ['red' if access_type == 'Unauthorized' else 'blue'
               for access_type in df['Access Type']]

# create scatter plot with black bubble edges
plt.scatter(df['host/IP address'], df['Count'], s=size,
            alpha=0.4, c=fill_colors, edgecolors='black')

# rotate the tick labels; the IP addresses will not fit on the axis without rotation
plt.xticks(rotation=90)

# set legend handles
plt.legend(handles=[red_patch, blue_patch])

# give y and x axis titles and plot title
plt.xlabel('Host/IP Address')
plt.ylabel('Access Type Counts')
plt.title('Access Type Counts by Host/IP Address', y=1.05)
plt.show()

You are just giving sample data here, correct? Can it happen that one IP address has both unauthorized and authorized access? Make sure your plot is robust to such cases.
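One way to handle that case, sketched here under assumptions, is to pivot the data so each IP address gets one column per access type and then draw grouped bars. The last row of the sample below is hypothetical (it is not in the question's data) and exists only to show an IP with both access types; the `palette` mapping and the output file name are also made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Three rows from the question plus one HYPOTHETICAL row (the last one), in which
# an IP that already has authorized attempts also has unauthorized ones.
df = pd.DataFrame({
    "Access Type": ["Authorized", "Authorized", "Unauthorized", "Unauthorized"],
    "host/IP address": ["206.196.21.129", "207.30.238.8",
                        "207.243.167.114", "207.30.238.8"],
    "Count": [23, 46, 23, 5],  # the 5 is invented for illustration
})

# One row per IP, one column per access type; missing combinations become 0.
wide = df.pivot_table(index="host/IP address", columns="Access Type",
                      values="Count", aggfunc="sum", fill_value=0)

# Hypothetical color mapping, applied in the pivoted column order.
palette = {"Authorized": "g", "Unauthorized": "r"}
ax = wide.plot(kind="bar", color=[palette[col] for col in wide.columns])

# Grouped bars: each IP on the x axis gets one bar per access type.
ax.set_xlabel("Host/IP Address")
ax.set_ylabel("Count")
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig("access_counts_by_ip.png")  # or plt.show() in an interactive session
```

With this layout the x axis carries the IP addresses and the access types appear as side-by-side bars within each IP, which is close to the "main header and sub-header" arrangement described in the question.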

This article gives nice ideas on how to visualize categorical variables.
