
How to plot a graph with 2 variables on the x axis and count on the y axis?

I'm trying to plot a graph that shows the number of authorized and unauthorized access attempts for each distinct IP address. The data I have looks something like this:

    Access Type     host/IP address Count
0   Authorized      206.196.21.129  23
1   Authorized      207.30.238.8    46
2   Authorized      208.62.55.75    23
3   Authorized      216.12.111.241  23
4   Authorized      63.197.98.106   23
5   Authorized      67.95.49.172    23
6   Unauthorized    207.243.167.114 23
7   Unauthorized    209.152.168.249 10
8   Unauthorized    65.166.159.14   10
9   Unauthorized    68.143.156.89   10

How do I go about doing this? I am thinking that the x-axis will have the IP addresses as the main header and the access types as the sub-header, with the count on the y-axis.

You can do something like this. In the following code, "Unauthorized" IPs are colored red and "Authorized" IPs green; you can change the colors.

import pandas as pd
import matplotlib.pyplot as plt


# df has data
colors = ['r' if item == "Unauthorized" else 'g' for item in df["Access Type"]]
df.plot(kind='bar', x='host/IP address', y='Count', color=colors, legend=False)
plt.show()

This produces a bar chart with one bar per IP address, colored by access type.
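The snippet above assumes `df` already holds the data. Here is a minimal self-contained sketch, assuming the sample rows from the question. It calls matplotlib's `Axes.bar` directly rather than `DataFrame.plot`, and it adds a legend that the answer does not have. `plot_counts` is a hypothetical helper name.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# Sample rows copied from the question.
df = pd.DataFrame({
    "Access Type": ["Authorized"] * 6 + ["Unauthorized"] * 4,
    "host/IP address": [
        "206.196.21.129", "207.30.238.8", "208.62.55.75", "216.12.111.241",
        "63.197.98.106", "67.95.49.172", "207.243.167.114", "209.152.168.249",
        "65.166.159.14", "68.143.156.89",
    ],
    "Count": [23, 46, 23, 23, 23, 23, 23, 10, 10, 10],
})


def plot_counts(data):
    """Draw one bar per IP address, colored by access type (hypothetical helper)."""
    colors = ["r" if t == "Unauthorized" else "g" for t in data["Access Type"]]
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.bar(data["host/IP address"], data["Count"], color=colors)
    # Legend entries are built by hand because every bar belongs to one series.
    ax.legend(handles=[
        mpatches.Patch(color="g", label="Authorized"),
        mpatches.Patch(color="r", label="Unauthorized"),
    ])
    ax.set_xlabel("host/IP address")
    ax.set_ylabel("Count")
    ax.tick_params(axis="x", rotation=90)  # long IP labels overlap otherwise
    fig.tight_layout()
    return ax
```

Call `plot_counts(df)` followed by `plt.show()` to display the chart.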

Here is a more elegant way in Python, using plotnine:

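import pandas as pd
from plotnine import *
# %matplotlib inline   (Jupyter only: uncomment to render plots inline)
df = pd.read_csv('~/Downloads/Untitled spreadsheet - Sheet1.csv')
# Column names follow the question's table; adjust them if your CSV differs.
# The outer parentheses let the expression continue across lines.
(ggplot(df, aes(x='Access Type', y='Count', fill='host/IP address'))
 + geom_bar(stat="identity", position="dodge"))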

IP addresses as the fill color

If you don't have plotnine installed, run:

pip install plotnine

Here is another way to color the chart, if you are interested: put the IP addresses on the x-axis and use the access type as the fill color.

# The outer parentheses let the expression continue across lines.
(ggplot(df, aes(x='host/IP address', y='Count', fill='Access Type'))
 + geom_bar(stat="identity", position="dodge")
 + theme(axis_text_x=element_text(angle=45)))


I am a bit late to the party. Since there are already some bar charts shown as a solution, what do you think about using a bubble plot?


# Visualizing 4-D mixed data using bubble plots
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

#create custom legend
red_patch = mpatches.Patch(color='red', label='Unauthorized Access', alpha=0.4)
blue_patch = mpatches.Patch(color='blue', label='Authorized Access', alpha=0.4)

#specify figure size
plt.figure(figsize=(7,7))

#specify bubble size
size = df['Count']*25

#define fill and edge colors
fill_colors = ['red' if access_type == 'Unauthorized' else 'blue'
               for access_type in df['Access Type']]
# edge_colors is not used below; the scatter call draws black edges instead
edge_colors = ['red' if access_type == 'Unauthorized' else 'blue'
               for access_type in df['Access Type']]

#create scatter plot
plt.scatter(df['host/IP address'], df['Count'], s=size, 
            alpha=0.4, color=fill_colors, edgecolors="black",)

#rotate x-axis tick labels; the IP addresses will not fit on the axis without rotation
plt.xticks(rotation=90)

#set legend handles
plt.legend(handles=[red_patch, blue_patch])

# give y and x axis titles and plot title
plt.xlabel('Host/IP Address')
plt.ylabel('Access Type Counts')
plt.title('Access Type Counts by Host/IP Address', y=1.05)
plt.show()

You are just giving sample data here, correct? Can it happen that you have Unauthorized and Authorized Access for one IP Address? Make sure your plot will be robust to such cases.
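One way to make the plot robust to that case is to reshape the data so each IP address has one column per access type, then draw grouped bars. This is a sketch under assumptions, not something the answers above do: the `sample` rows are made up for illustration (one IP appears under both access types), and `counts_by_ip_and_type` and `plot_grouped` are hypothetical helper names.

```python
import pandas as pd
import matplotlib.pyplot as plt


def counts_by_ip_and_type(data):
    """Return one row per IP and one column per access type (hypothetical helper)."""
    return data.pivot_table(index="host/IP address", columns="Access Type",
                            values="Count", aggfunc="sum", fill_value=0)


def plot_grouped(wide):
    """Grouped bars: IP address on the x-axis, one bar per access type."""
    # Columns come out sorted alphabetically: Authorized, then Unauthorized.
    ax = wide.plot(kind="bar", color=["g", "r"], rot=45)
    ax.set_ylabel("Count")
    return ax


# Made-up rows: 207.30.238.8 has both authorized and unauthorized attempts.
sample = pd.DataFrame({
    "Access Type": ["Authorized", "Unauthorized", "Unauthorized"],
    "host/IP address": ["207.30.238.8", "207.30.238.8", "65.166.159.14"],
    "Count": [46, 5, 10],
})
wide = counts_by_ip_and_type(sample)
```

Calling `plot_grouped(wide)` and then `plt.show()` draws the IP addresses as the main groups with the access types side by side within each group, which matches the layout described in the question.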

This article gives nice ideas on how to visualize categorical variables
