Count specific values in a column and tabulate the results (Pandas, Python)

Question

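I have a dataset with 10 million rows. I want to count how many times certain numbers occur in the Values column and create a column holding the results. Specifically, I want to count how often 0 and every number up to 100,000 appears in the Values column. Previously I did this in Excel with the formula =Countif(A:A,row(a1))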

Counting one specific number in code is simple:

df.loc[df.Values == '21288', 'Values'].count()

I am afraid the code below will crash my computer, so before trying it I would like to ask whether it is correct.

import pandas as pd
df = pd.read_csv('Hello world')

for index in df.index:
    df['Counts'] = df.loc[df.New_Value == df.loc[index,'New_Value'], 'New_Value'].count()

Answer 1

You can use value_counts:

Input data:
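The input table did not survive on this page. Judging from the Values column in the output shown later in this answer, it can be rebuilt roughly as follows (a reconstruction, not the original listing):

```python
import pandas as pd

# Values copied from the "Values" column of the output table
# shown later in this answer (rows 0 to 10).
df = pd.DataFrame({
    "Values": [54796, 78957, 75894, 78469, 26972, 28446,
               28784, 55795, 32698, 55795, 26972]
})
print(df)
```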

value_counts:

>>> df.Values.value_counts()
#output
26972    2
55795    2
28446    1
78957    1
54796    1
32698    1
75894    1
78469    1
28784    1
Name: Values, dtype: int64

Filtering the value_counts result:

value_df = df.Values.value_counts().to_frame().astype(int)
#results only below 40000
value_df[value_df.index < 40000]

       Values
26972       2
28446       1
32698       1
28784       1
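The question asks for a count for every number from 0 to 100,000, including numbers that never occur, whereas value_counts only lists values that are actually present. One possible sketch is to reindex the counts over the full range and fill the gaps with 0. The sample data and the column names Number and Count are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; replace with the real Values column.
df = pd.DataFrame({"Values": [0, 3, 3, 7, 100000, 3]})

# Count the values that occur, then reindex over 0..100000 so that
# numbers which never occur get a count of 0.
counts = (
    df["Values"]
    .value_counts()
    .reindex(range(0, 100_001), fill_value=0)
)

# Turn the result into a two-column table: the number and its count.
table = counts.rename_axis("Number").reset_index(name="Count")
print(table.head(8))
```

This plays the same role as filling =Countif(A:A,row(a1)) down 100,001 rows in Excel, but the column is scanned only once.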

If you want to add a Count column to the original DataFrame:

#creating a dictionary based on the value counts
>>> d = df.Values.value_counts().to_dict()

#mapping the count to the Values columns
>>> df['Count'] = df.Values.map(d)

Output:

    Values  Count
0    54796      1
1    78957      1
2    75894      1
3    78469      1
4    26972      2
5    28446      1
6    28784      1
7    55795      2
8    32698      1
9    55795      2
10   26972      2
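For comparison, the loop in the question compares the whole column once per row, which for 10 million rows means on the order of 10^14 comparisons, and each iteration assigns a single number to the entire Counts column, so only the last row's count would remain. As an alternative to the dictionary mapping above, the same per-row count can be sketched with groupby and transform, here on the small sample table:

```python
import pandas as pd

df = pd.DataFrame({
    "Values": [54796, 78957, 75894, 78469, 26972, 28446,
               28784, 55795, 32698, 55795, 26972]
})

# For each row, the number of rows that share its value.
df["Count"] = df.groupby("Values")["Values"].transform("count")
print(df)
```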

Confirming with your approach:

>>> df.loc[df.Values == 26972, 'Values'].count()
2
>>> df.loc[df.Values == 55795, 'Values'].count()
2

For your V100.csv:

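df = pd.read_csv('V100.csv', delimiter=',')
# Convert every column to numbers; entries that cannot be parsed become NaN,
# and rows containing them are dropped
df = df.apply(pd.to_numeric, errors='coerce').dropna()
df = df.astype(int)

# Count every distinct value in Fives, then check one value with the approach from the question
print(df['Fives'].value_counts())
print(df.loc[df.Fives == 9100, 'Fives'].count())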

9100     2445
9200     2401
100      2394
1100     2350
8200     2315
         ...
43855     862
33866     860
74277     858
47922     857
53011     834
Name: Fives, Length: 9910, dtype: int64

2445

Note that the count for 9100 is the same in both outputs.
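If a file were too large to load in one go, one option would be to read it in chunks and add up the per-chunk counts. This is only a sketch: the in-memory CSV and the chunk size of 2 are stand-ins chosen so the example runs on its own, and in practice a file path and a much larger chunksize would be passed to read_csv.

```python
import io
import pandas as pd

# Stand-in for a large file; in practice pass a file path to read_csv.
csv_data = io.StringIO("Values\n5\n7\n5\n9\n7\n5\n")

total = None
for chunk in pd.read_csv(csv_data, chunksize=2):
    counts = chunk["Values"].value_counts()
    # Add this chunk's counts to the running total; a value missing on
    # either side is treated as 0.
    total = counts if total is None else total.add(counts, fill_value=0)

# The addition with fill_value can produce floats, so convert back to int.
total = total.astype(int).sort_values(ascending=False)
print(total)
```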

Solution 1
1 2020-10-19 17:25:39
