
How to represent boolean data in a graph

How can I represent the data below in a comprehensible graph? I tried groupby() from pandas, but the result is not easy to read. My objective is to show which of the combinations below causes the most accidents.

pieton  bicyclette  camion_lourd  vehicule
0       0           1             1 
0       1           0             1
1       1           0             0
0       1           1             0
0       1           0             1
1       0           0             1
0       0           0             1
0       0           0             1
1       1           0             0
0       1           0             1

y = df.groupby(['pieton', 'bicyclette', 'camion_lourd', 'vehicule']).size()
y.unstack()

Result: [image]

Here are some visualizations that may help you:

#data analysis and wrangling
import pandas as pd
import numpy as np

# visualization
import matplotlib.pyplot as plt

columns = ['pieton', 'bicyclette', 'camion_lourd', 'vehicule']
# The ten rows from the question
df = pd.DataFrame([[0,0,1,1],[0,1,0,1],
                  [1,1,0,0],[0,1,1,0],
                  [0,1,0,1],[1,0,0,1],
                  [0,0,0,1],[0,0,0,1],
                  [1,1,0,0],[0,1,0,1]], columns=columns)

You can start by looking at how many accidents involve each category:

# Set up a 2x2 grid of plots, one per category
fig = plt.figure(figsize=(10, 10))
fig_dims = (2, 2)

# For each category, count accidents that did (1) or did not (0) involve it
plt.subplot2grid(fig_dims, (0, 0))
df['pieton'].value_counts().plot(kind='bar', title='pieton')
plt.subplot2grid(fig_dims, (0, 1))
df['bicyclette'].value_counts().plot(kind='bar', title='bicyclette')
plt.subplot2grid(fig_dims, (1, 0))
df['camion_lourd'].value_counts().plot(kind='bar', title='camion_lourd')
plt.subplot2grid(fig_dims, (1, 1))
df['vehicule'].value_counts().plot(kind='bar', title='vehicule')
plt.tight_layout()
plt.show()

Which gives: [image]

Or if you prefer:

# pd.Series.value_counts replaces the deprecated top-level pd.value_counts
df.apply(pd.Series.value_counts).plot(kind='bar', title='all types')

[image]
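Because every column is a 0/1 flag, the per-category proportions can also be read directly as column means, without a plot. This is only a minimal sketch: it rebuilds the ten rows from the question so that it runs on its own, and the name `share` is introduced here for illustration.

```python
import pandas as pd

columns = ['pieton', 'bicyclette', 'camion_lourd', 'vehicule']
# The ten rows from the question
rows = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1],
        [1, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
df = pd.DataFrame(rows, columns=columns)

# The mean of a 0/1 column is the fraction of accidents involving that category
share = df.mean().sort_values(ascending=False)
print(share)
```

Calling `share.plot(kind='bar')` would draw the same numbers as a single bar chart.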

But, more interestingly, I would do a comparison per pair. For example, for pedestrians:

# Count the accidents involving a pedestrian together with each other category
pieton = {}
for col in columns:
    pieton[col] = np.sum(df.pieton[df[col] == 1])
pieton.pop('pieton', None)  # drop the pedestrian-with-itself total
plt.bar(range(len(pieton)), list(pieton.values()), align='center')
plt.xticks(range(len(pieton)), list(pieton.keys()))
plt.title("Who had an accident with a pedestrian?")
plt.show()

Which gives: [image]

Similar plots can be made for bicycles, trucks and cars, giving: [image] [image] [image]
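Rather than repeating the loop once per category, all pairwise counts can be computed in one step as a co-occurrence matrix. The following is a sketch, not part of the original answer: it assumes the ten rows from the question, and the name `cooccurrence` is introduced here for illustration.

```python
import pandas as pd

columns = ['pieton', 'bicyclette', 'camion_lourd', 'vehicule']
# The ten rows from the question
rows = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1],
        [1, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
df = pd.DataFrame(rows, columns=columns)

# Entry (a, b) counts the accidents in which both a and b were involved;
# the diagonal holds the total number of accidents for each category.
cooccurrence = df.T.dot(df)
print(cooccurrence)
```

Each row of `cooccurrence`, with its own diagonal entry dropped, holds the same numbers as one of the per-category bar charts.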

It would be interesting to have more data points in order to draw better conclusions. However, this still tells us to watch out for bicycles if you are driving!
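To get back to the original question about full combinations, one option is to label each row with the names of the categories involved and count the labels. This is a hedged sketch on the ten rows from the question; `combo_label` and `combos` are hypothetical helper names, not something from pandas.

```python
import pandas as pd

columns = ['pieton', 'bicyclette', 'camion_lourd', 'vehicule']
# The ten rows from the question
rows = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1],
        [1, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
df = pd.DataFrame(rows, columns=columns)

def combo_label(row):
    """Join the names of the categories flagged with 1 in this row."""
    return ' + '.join(name for name in columns if row[name] == 1)

# One label per accident, then count how often each combination occurs
combos = df.apply(combo_label, axis=1).value_counts()
print(combos)
```

Here `combos.plot(kind='barh')` would give one bar per combination, which is easier to read than the unstacked groupby table.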

Hope this helped!
