pandas grouping aggregation across multiple columns in a DataFrame
Grouping pandas DataFrame with Multiple Columns
Let's say I have a pandas DataFrame like this:
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Family | Genus | Species | hasHair | laysEggs | canFly | hasLongHorns |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae | Ovis | Sheep | 1 | 0 | 0 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea | Passeridae | Sparrow | 0 | 1 | 1 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Ornithorhynchidae | Ornithorhynchus | Platypus | 1 | 1 | 0 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae | Ovis | Mouflon | 1 | 0 | 0 | 1 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea | Passeridae | Passer | 0 | 1 | 1 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
I would like to "summarize" the data to get the following:
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Family | Genus | Species | hasHair | laysEggs | canFly | hasLongHorns |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Bovidae | Ovis | Sheep | 1 | 0 | 0 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| | | Mouflon | 1 | 0 | 0 | 1 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Ornithorhynchidae | Ornithorhynchus | Platypus | 1 | 1 | 0 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| Passeroidea | Passeridae | Sparrow | 0 | 1 | 1 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
| | | Passer | 0 | 1 | 1 | 0 |
+-------------------+-----------------+----------+---------+----------+--------+--------------+
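As you can see, this is more a layout change for readability than actual data processing: the attribute values stay the same. I just want to produce a report that is easier to read.
Right now I am not sure how to approach this. Could anyone give me some pointers?
Thanks!
R.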
To make it easier to read, you can create a MultiIndex from the three label columns and sort it:
df = df.set_index(['Family','Genus', 'Species']).sort_index()
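For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same call. The variable name `report` is only illustrative. When a MultiIndex is printed, pandas by default leaves repeated outer labels blank, which gives the grouped look asked for.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Family": ["Bovidae", "Passeroidea", "Ornithorhynchidae", "Bovidae", "Passeroidea"],
    "Genus": ["Ovis", "Passeridae", "Ornithorhynchus", "Ovis", "Passeridae"],
    "Species": ["Sheep", "Sparrow", "Platypus", "Mouflon", "Passer"],
    "hasHair": [1, 0, 1, 1, 0],
    "laysEggs": [0, 1, 1, 0, 1],
    "canFly": [0, 1, 0, 0, 1],
    "hasLongHorns": [0, 0, 0, 1, 0],
})

# Move the three label columns into a MultiIndex and sort it so that
# rows sharing the same Family and Genus end up next to each other.
# "report" is an illustrative name, not something from the original post.
report = df.set_index(["Family", "Genus", "Species"]).sort_index()

# Repeated Family/Genus labels are shown only once per group when printed.
print(report)
```

Note that `sort_index()` sorts on every level, so species are also ordered alphabetically within each genus (Mouflon before Sheep, Passer before Sparrow), which differs slightly from the row order in the desired output above. The attribute values themselves are not changed.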