Counting the number of integer values in a column depending on the value of another column in pandas
Hi, I have a column that contains both integer and text values. I am trying to write a Python function that counts only the integer values for each value in another column.
Here is a sample of the data:
constructorId positionText
1 3
1 4
1 R
4 6
4 5
4 N
4 9
I would like the result to look like this:
constructorID positionText_count
1 2
4 3
Here is my code:
def not_finished(c):
    r = 0
    for c in hybrid_era_results['constructorId']:
        y = hybrid_era_results['positionText']
        if isinstance(y, int):
            r = r+1
    return r
This code does not raise an error, but when I call the function it always returns 0. What am I doing wrong?
IIUC, you can use to_numeric to flag the numeric values, then groupby.sum to count them per group:
(pd.to_numeric(df['positionText'], errors='coerce')
.notna()
.groupby(df['constructorId'])
.sum()
.reset_index()
)
output:
constructorId positionText
0 1 2
1 4 3
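For anyone who wants to run the snippet above end to end, here is a minimal self-contained sketch. It assumes the question's sample data is stored with positionText as strings (the original DataFrame was not posted, so the construction below is an assumption). Note that to_numeric also accepts decimals such as "3.5", so this counts numeric values in general rather than strictly integers.

```python
import pandas as pd

# Sample data from the question; storing positionText as strings is an assumption.
df = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# Non-numeric entries such as "R" and "N" become NaN, so notna() marks the numeric rows.
is_numeric = pd.to_numeric(df["positionText"], errors="coerce").notna()

# Summing booleans within each group counts the True values.
counts = is_numeric.groupby(df["constructorId"]).sum().reset_index()
print(counts)
```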
To set a custom column name, use .agg(positionText_count='sum') instead of .sum() (if the name contains spaces or special characters, use .agg(**{'positionText_count': 'sum'})):
constructorId positionText_count
0 1 2
1 4 3
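As for why the function in the question always returns 0: inside the loop, y is the whole positionText column (a Series), not a single value, so isinstance(y, int) is never true. The sketch below illustrates this and shows one possible per-constructor rewrite. The helper name count_numeric and the sample DataFrame are hypothetical, not from the original post.

```python
import pandas as pd

# Stand-in for the asker's hybrid_era_results (assumed to hold strings).
hybrid_era_results = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# y is a pandas Series, so the isinstance check in the question is always False.
y = hybrid_era_results["positionText"]
print(type(y).__name__, isinstance(y, int))


def count_numeric(df, constructor_id):
    """Hypothetical helper: count numeric positionText values for one constructor."""
    rows = df.loc[df["constructorId"] == constructor_id, "positionText"]
    return int(pd.to_numeric(rows, errors="coerce").notna().sum())


print(count_numeric(hybrid_era_results, 1))
print(count_numeric(hybrid_era_results, 4))
```

Even when looping value by value, entries in a mixed column like this one are typically strings such as "3", so an isinstance(value, int) check would probably still fail; converting with to_numeric sidesteps that.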
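Since there are no clear details about the data, here is a simple example that checks whether the integer values in col1 also exist in col2, where both columns are of str type.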
import pandas as pd
# simple example df
df = pd.DataFrame({"col1": ["value1", "4", "5", "3"], "col2": ["3", "6", "value2", "4"]})
# get the index where we have integer in both columns
idx_col1 = df.col1.str.isdigit()
idx_col2 = df.col2.str.isdigit()
# retrieve integer values
df_col1_int = df.loc[idx_col1, "col1"]
df_col2_int = df.loc[idx_col2, "col2"]
# get only the values from col1 that exist in col2 (they do not need to share the same index)
idx_exists = df_col1_int.isin(df_col2_int)
df_exists = df_col1_int[idx_exists]
# get the number of integers in common
len(df_exists)
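The same str.isdigit idea could also be applied to the question's own data. This is only a sketch: the sample DataFrame is reconstructed from the question and assumes positionText holds strings. Unlike to_numeric, str.isdigit rejects decimals and negative numbers such as "3.5" or "-3", which may or may not be what you want.

```python
import pandas as pd

# Sample data reconstructed from the question (string dtype is an assumption).
df = pd.DataFrame({
    "constructorId": [1, 1, 1, 4, 4, 4, 4],
    "positionText": ["3", "4", "R", "6", "5", "N", "9"],
})

# True where the text consists only of digits; astype(str) guards against mixed types.
is_int = df["positionText"].astype(str).str.isdigit()

# Count the True values per constructor and name the resulting column.
counts = (
    is_int.groupby(df["constructorId"])
    .sum()
    .reset_index(name="positionText_count")
)
print(counts)
```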