How to count the number of instances for each category using group by in pydatatable
I have a dataframe as shown below, and I want to apply group-by and count operations on it to get the count of each category in a pydatatable way.
Here is a sample dt containing different programming languages:
prog_lang_dt = dt.Frame({"languages": ['html','R','R','html','R','javascript','R','javascript','html']})
Here is the code I'm trying to use to apply the group and count operations:
prog_lang_dt[:,:,by(f.languages)]
Is there any count-specific function for it in place of J... DT[i,j,by]?
The count() method can be used to find the number of elements in each group:
from datatable import dt, f, by, count
prog_lang_dt = dt.Frame(languages= ['html', 'R', 'R', 'html', 'R', 'javascript',
'R', 'javascript', 'html'])
prog_lang_dt[:, count(), by(f.languages)]
produces
| languages count
-- + ---------- -----
0 | R 4
1 | html 3
2 | javascript 2
[3 rows x 2 columns]
Although not needed for your example, the function count can also take a column as an argument, in which case it will report the number of non-missing entries in that specific column.
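To make the difference between the two forms concrete, here is a minimal plain-Python sketch of what each one computes per group. It uses only the standard library rather than datatable itself, and the "stars" column is a hypothetical addition (with None standing in for a missing value) that is not part of the original question.

```python
from collections import defaultdict

# Same language sequence as in the question. The second field ("stars") is a
# hypothetical column invented for this illustration; None marks a missing value.
rows = [
    ("html", 5), ("R", None), ("R", 3), ("html", 6), ("R", 4),
    ("javascript", 2), ("R", None), ("javascript", None), ("html", 7),
]

group_sizes = defaultdict(int)  # mimics count(): number of rows in each group
non_missing = defaultdict(int)  # mimics count(f.stars): non-missing entries only

for language, stars in rows:
    group_sizes[language] += 1
    non_missing[language] += 1 if stars is not None else 0

# Print one line per group: language, group size, non-missing count
for language in sorted(group_sizes):
    print(language, group_sizes[language], non_missing[language])
```

In datatable, the corresponding calls would presumably be prog_lang_dt[:, count(), by(f.languages)] and prog_lang_dt[:, count(f.stars), by(f.languages)], assuming the frame had such a stars column.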