Python / Pandas - Count number of rows with certain index
I have this dataframe:
content
id
17 B
17 A
6 A
15 A
...
I want to count how many rows have the index 17 (in this case that would be 2). Is there a way to do that?
You can group by the index level:
df.groupby(level=0).count()
Or use reset_index():
df.reset_index().groupby('id').count()
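To make the groupby approach concrete for the single label in the question, here is a minimal sketch. The frame below is a hypothetical reconstruction of the question's data, assuming an integer index named id; size() is used instead of count() because it counts rows per label regardless of missing values, whereas count() reports non-null cells per column.

```python
import pandas as pd

# Hypothetical frame mirroring the question; integer index named 'id' is an assumption.
df = pd.DataFrame(
    {"content": ["B", "A", "A", "A"]},
    index=pd.Index([17, 17, 6, 15], name="id"),
)

# One row count per distinct index label.
rows_per_id = df.groupby(level=0).size()

# Pick out the label of interest.
count_17 = int(rows_per_id.loc[17])
print(count_17)
```

If only one label matters, grouping the whole frame does more work than needed, but it is convenient when counts for every label are wanted at once.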
You can try:
sum(df.index == 17)
df.index == 17 returns a boolean array that is True where the index value matches and False otherwise. When the array is passed to sum, each True counts as 1, so the result is the number of matching rows.
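The boolean-mask idea can be sketched as follows. The frame is again a hypothetical reconstruction with an integer index; the mask's own .sum() method is used here as an assumption-free alternative to the built-in sum, and it gives the same count.

```python
import pandas as pd

# Hypothetical frame mirroring the question; integer index named 'id' is an assumption.
df = pd.DataFrame(
    {"content": ["B", "A", "A", "A"]},
    index=pd.Index([17, 17, 6, 15], name="id"),
)

# Element-wise comparison yields one boolean per row.
mask = df.index == 17

# True counts as 1 and False as 0, so the sum is the number of matches.
count_17 = int(mask.sum())
print(mask.tolist(), count_17)
```

The same mask can also be reused to select the matching rows with df[mask].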
Input:
import pandas as pd

# Your DataFrame (note that the ids are strings here)
test_dict = {'id': ['17', '17', '6', '15'], 'content': ['B', 'A', 'A', 'A']}
testd_df = pd.DataFrame.from_dict(test_dict)  # create DataFrame from dict
testd_df.set_index('id', inplace=True)  # set 'id' as the index, in place
testd_df
Output:
|content
--------------
id |
-------------
17 | B
17 | A
6 | A
15 | A
Solution: use the pandas.Index.value_counts API
According to the documentation, pandas.Index.value_counts returns a pd.Series containing the counts of unique values.
So now I can select the specific index I want by using pandas.Series.loc (not to be confused with .iloc).
# Solution
Input: index_count = testd_df.index.value_counts() # count occurrences of each unique index value
index_count
Output: 17 2
15 1
6 1
dtype: int64
---------------------------------
Input: index_count.loc['17'] # select the label you care about (a string here)
Output: 2
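One detail worth illustrating: in this answer the ids are strings, so the lookup key must be '17' rather than 17, and .loc raises a KeyError for a label that is absent. The sketch below rebuilds the same frame and uses Series.get with a default of 0 as one possible way to handle missing labels; the label '99' is a made-up example of a missing id.

```python
import pandas as pd

# Same data as in the answer above: the ids are strings.
test_dict = {"id": ["17", "17", "6", "15"], "content": ["B", "A", "A", "A"]}
testd_df = pd.DataFrame(test_dict).set_index("id")

# Count occurrences of each unique index value.
index_count = testd_df.index.value_counts()

# .get returns the default instead of raising when the label is absent.
count_17 = int(index_count.get("17", 0))
count_missing = int(index_count.get("99", 0))  # '99' is a hypothetical absent label

# The integer 17 is not among the string labels.
has_int_17 = 17 in index_count.index
print(count_17, count_missing, has_int_17)
```

If the index in your own data holds integers, as in the original question, the key would be 17 without quotes.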