How to plot a graph with 2 variables on the x axis and count on the y axis?
I'm trying to plot a graph showing the number of authorized and unauthorized access attempts per distinct IP address. The data I have looks something like this:
    Access Type  host/IP address  Count
0    Authorized   206.196.21.129     23
1    Authorized     207.30.238.8     46
2    Authorized     208.62.55.75     23
3    Authorized   216.12.111.241     23
4    Authorized    63.197.98.106     23
5    Authorized     67.95.49.172     23
6  Unauthorized  207.243.167.114     23
7  Unauthorized  209.152.168.249     10
8  Unauthorized    65.166.159.14     10
9  Unauthorized    68.143.156.89     10
How do I go about doing this? I'm thinking the x-axis would have the IP addresses as the main header and the count for each access type as the sub-header.
You can do something like this. In the following code I've colored the "Unauthorized" IPs red and the "Authorized" IPs green; you can change that.
import pandas as pd
import matplotlib.pyplot as plt
# df is the DataFrame holding the data shown in the question
colors = ['r' if item == "Unauthorized" else 'g' for item in df["Access Type"]]
df.plot(kind='bar', x='host/IP address', y='Count', color=colors, legend=False)
plt.show()
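The snippet above assumes `df` already exists. As a minimal sketch, the sample rows from the question can be typed in by hand to get a reproducible starting point. The `rows` list and the `bar_colors` helper are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question: (access type, IP address, count)
rows = [
    ("Authorized", "206.196.21.129", 23),
    ("Authorized", "207.30.238.8", 46),
    ("Authorized", "208.62.55.75", 23),
    ("Authorized", "216.12.111.241", 23),
    ("Authorized", "63.197.98.106", 23),
    ("Authorized", "67.95.49.172", 23),
    ("Unauthorized", "207.243.167.114", 23),
    ("Unauthorized", "209.152.168.249", 10),
    ("Unauthorized", "65.166.159.14", 10),
    ("Unauthorized", "68.143.156.89", 10),
]
df = pd.DataFrame(rows, columns=["Access Type", "host/IP address", "Count"])


def bar_colors(access_types):
    """Hypothetical helper: one color per row, red for unauthorized, green otherwise."""
    return ["r" if access == "Unauthorized" else "g" for access in access_types]


colors = bar_colors(df["Access Type"])
```

With `df` and `colors` built this way, the `df.plot(kind='bar', ...)` call from the answer can be run unchanged.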
Here is a more elegant way in Python, using plotnine:
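import pandas as pd
from plotnine import ggplot, aes, geom_bar

# In a Jupyter notebook, run `%matplotlib inline` first to render the plot inline.
df = pd.read_csv('~/Downloads/Untitled spreadsheet - Sheet1.csv')

# The parentheses let the expression span several lines;
# a line ending in a bare `+` is a syntax error in Python.
# 'host address' is the column name in this CSV; adjust it if yours is 'host/IP address'.
(ggplot(df, aes(x='Access Type', y='Count', fill='host address'))
 + geom_bar(stat='identity', position='dodge'))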
If you don't have plotnine installed, use:
pip install plotnine
Here is another way of coloring, if you are interested:
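from plotnine import theme, element_text

# One bar per host, filled by access type, with rotated x tick labels
(ggplot(df, aes(x='host address', y='Count', fill='Access Type'))
 + geom_bar(stat='identity', position='dodge')
 + theme(axis_text_x=element_text(angle=45)))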
I'm a bit late to the party. Since several bar charts have already been posted as solutions, what do you think about using a bubble plot?
# Visualizing the access counts using a bubble plot
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# df is the DataFrame from the question
# create custom legend
red_patch = mpatches.Patch(color='red', label='Unauthorized Access', alpha=0.4)
blue_patch = mpatches.Patch(color='blue', label='Authorized Access', alpha=0.4)
# specify figure size
plt.figure(figsize=(7, 7))
# specify bubble size (scaled up so that small counts remain visible)
size = df['Count'] * 25
# define fill colors: red for unauthorized, blue for authorized
fill_colors = ['red' if access_type == 'Unauthorized' else 'blue'
               for access_type in df['Access Type']]
# create scatter plot; every bubble gets a black edge
plt.scatter(df['host/IP address'], df['Count'], s=size,
            alpha=0.4, color=fill_colors, edgecolors='black')
# rotate the x tick labels; IP addresses will not fit on the axis without rotation
plt.xticks(rotation=90)
# set legend handles
plt.legend(handles=[red_patch, blue_patch])
# give the x and y axes titles and add a plot title
plt.xlabel('Host/IP Address')
plt.ylabel('Access Type Counts')
plt.title('Access Type Counts by Host/IP Address', y=1.05)
plt.show()
You are only giving sample data here, correct? Can a single IP address have both unauthorized and authorized access? Make sure your plot is robust to such cases.
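One way to handle an IP address that appears under both access types is to reshape the table so that each IP gets one row and each access type gets its own column, then draw grouped bars. The sketch below uses made-up rows in which 207.30.238.8 occurs twice. The data and the `plot_grouped` helper are assumptions for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical data: 207.30.238.8 shows up as both Authorized and Unauthorized
df = pd.DataFrame({
    "Access Type": ["Authorized", "Unauthorized", "Authorized", "Unauthorized"],
    "host/IP address": ["207.30.238.8", "207.30.238.8",
                        "63.197.98.106", "65.166.159.14"],
    "Count": [46, 5, 23, 10],
})

# One row per IP, one column per access type; missing combinations become 0
wide = df.pivot_table(index="host/IP address", columns="Access Type",
                      values="Count", aggfunc="sum", fill_value=0)


def plot_grouped(wide):
    """Hypothetical helper: grouped bars, one group per IP address."""
    import matplotlib.pyplot as plt  # imported here so the reshaping runs without it

    # Columns are sorted alphabetically: Authorized (green), Unauthorized (red)
    ax = wide.plot(kind="bar", color=["g", "r"])
    ax.set_ylabel("Count")
    plt.tight_layout()
    plt.show()
```

Calling `plot_grouped(wide)` draws two bars side by side for every IP, so an address with both kinds of access shows both counts rather than having one hide the other.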
This article gives nice ideas on how to visualize categorical variables.